Loading a Jenkins build can be time consuming, which makes it impractical when debugging. Making a change, pushing a patch just to get a build, and then reloading the machine wastes time.
Instead, it's easier to build locally and then either replace the lnet.ko file or install the RPMs from scratch.
Below is the procedure to do that.
- Set up the system to build as instructed here: https://wiki.hpdd.intel.com/pages/viewpage.action?pageId=52104622. Follow the steps in these sections:
  - Provision machine and installing dependencies
  - Preparing the Lustre Source
- You don't need to build Lustre against a patched kernel if you're only using ldiskfs, so you can download and install the kernel-devel, kernel-debug and kernel-debug-common packages:
  - Install the devel RPM (if you're downgrading the kernel, you'll need the '--oldpackage' option):
    - rpm -hiv --oldpackage kernel-devel*
- sh ./autogen.sh
- ./configure --with-linux=/usr/src/kernels/<kernel-release>/
- make rpms
- rpm -qa | grep lustre # list installed rpms
- rpm -e --nodeps <all of the existing rpms> # remove existing rpms reported by above command.
- rpm -ivh --nodeps <all the new rpms> # install the new rpms in your build directory.
- Now you can make changes to LNet, run "make", then replace /usr/lib/modules/<kernel release>/extra/lustre/net/lnet.ko with the new lnet.ko.
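The final step of the procedure can be scripted. The following is a minimal sketch, not part of the original procedure: the helper names and the `.orig` backup are hypothetical, and the build output location `lnet/lnet/lnet.ko` in the usage comment is an assumption about the source tree layout.

```shell
#!/bin/sh
# Sketch: swap a freshly built lnet.ko into the installed module tree.
# The function names and the .orig backup are illustrative assumptions.

# Print the installed location of lnet.ko for a given kernel release.
lnet_module_path() {
    # $1: kernel release, e.g. the output of `uname -r`
    printf '/usr/lib/modules/%s/extra/lustre/net/lnet.ko\n' "$1"
}

# Copy a new module over an installed one, keeping a one-time backup.
replace_module() {
    # $1: newly built module, $2: installed module to overwrite
    new=$1
    installed=$2
    if [ ! -f "$new" ]; then
        echo "missing build output: $new" >&2
        return 1
    fi
    if [ ! -f "$installed" ]; then
        echo "no installed module: $installed" >&2
        return 1
    fi
    # Keep the first original so the RPM-installed module can be restored.
    if [ ! -f "$installed.orig" ]; then
        cp -p "$installed" "$installed.orig" || return 1
    fi
    cp -p "$new" "$installed"
}

# Example use from the top of the build tree (build path is an assumption):
#   replace_module lnet/lnet/lnet.ko "$(lnet_module_path "$(uname -r)")"
```

The copy only changes the file on disk. Any lnet module that is already loaded presumably has to be unloaded and loaded again before the new code runs.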
2 Comments
Doug Oucharek
Ran into a problem following this for EE 3.0. For some reason, the "make rpms" is not making the "kmod*" rpms which are needed by weak module updates. Installing the modules, then, puts them where they cannot be found. When I look into /lib/modules, we have two versions of the kernel provisioned. We seem to be running the older one, but building the newer one. Without weak module updates in place, nothing is working properly.
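A quick check of the build output before installing can catch the missing packages early. This is only a sketch: the function name is hypothetical, and the `kmod-*.rpm` file name pattern is an assumption based on the "kmod*" naming mentioned above.

```shell
#!/bin/sh
# Sketch: verify that `make rpms` produced kmod packages.
# The function name and the kmod-*.rpm pattern are illustrative assumptions.

# Succeeds if at least one kmod-*.rpm exists in the given directory.
has_kmod_rpms() {
    # $1: directory holding the built RPMs (defaults to the current directory)
    dir=${1:-.}
    for rpm in "$dir"/kmod-*.rpm; do
        # An unmatched glob stays literal, so confirm the file really exists.
        if [ -e "$rpm" ]; then
            return 0
        fi
    done
    echo "no kmod-*.rpm found in $dir" >&2
    return 1
}

# Example use from the build directory:
#   has_kmod_rpms . || echo "do not install yet"
```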
Doug Oucharek
Another problem I have run into on the test cluster nodes is that they have two versions of Linux installed by loadjenkinsbuild. When I look in the /usr/src/kernels directory, I see one version of Linux, but when I do a "uname", I see an older version of Linux. So, when I configure Lustre with what is in /usr/src/kernels, the resulting RPMs get installed for the newer kernel and not the one which is actually running.
The solution Amir found is to download the kernel source RPM for the running kernel and install that in /usr/src/kernels. Configure Lustre to use that. The resulting RPMs install in the correct place.
This has solved a lot of RPM issues for me.
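The mismatch described above can be detected before running configure. The following is a minimal sketch: the function name is hypothetical, and it assumes that kernel trees under /usr/src/kernels are named after the kernel release string reported by `uname -r`.

```shell
#!/bin/sh
# Sketch: find the kernel tree that matches a given kernel release.
# The function name is an illustrative assumption, as is the convention that
# tree directory names equal the release string.

# Print the matching kernel tree, or fail and list what is available.
kernel_tree_for() {
    # $1: kernel release, $2: directory holding kernel trees
    release=$1
    base=${2:-/usr/src/kernels}
    if [ -d "$base/$release" ]; then
        printf '%s\n' "$base/$release"
    else
        echo "no kernel tree for $release under $base" >&2
        echo "available trees:" >&2
        # Send the listing to stderr and discard errors from ls itself.
        ls "$base" >&2 2>/dev/null
        return 1
    fi
}

# Example use: configure only when a tree for the running kernel exists.
#   tree=$(kernel_tree_for "$(uname -r)") && ./configure --with-linux="$tree"
```

If the function fails, the running kernel has no matching tree installed, which is the situation that led to RPMs being installed for the wrong kernel.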