
Compiling GROMACS 2019.5 with Intel Compilers and CUDA on CentOS 7 and 8

All tests reported below were performed using Intel Parallel Studio XE 2019 Update 4. You will likely achieve the same positive results with older versions of the Intel Compilers package.

Notes on CentOS 7: You have to install the SCL and EPEL repositories (these are mandatory requirements you cannot skip):

$ sudo yum install centos-release-scl centos-release-scl-rh epel-release

Then, using the new repositories, install devtoolset-8 and cmake3:

$ sudo yum install devtoolset-8 cmake3

In the bash session, where the compilation process will take place, execute:

$ scl enable devtoolset-8 bash

To install the latest CUDA compiler and libraries, you might use the procedure described in my previous posts. The latest version of CUDA is 10.2. Once all the prerequisites are satisfied, download the GROMACS 2019.5 source code tarball, unpack it, create a folder named "build" inside the source tree, enter it, load the Intel Compilers variables, then the SCL environment, and run the compilation and installation process:

$ mkdir ~/build
$ cd ~/build
$ wget ftp://ftp.gromacs.org/pub/gromacs/gromacs-2019.5.tar.gz
$ tar xvf gromacs-2019.5.tar.gz
$ cd gromacs-2019.5
$ mkdir build
$ cd build
$ source /opt/intel/compilers_and_libraries/linux/bin/compilervars.sh intel64
$ CC=icc CXX=icpc CFLAGS="-xHost" CXXFLAGS="-xHost" cmake3 .. -DCMAKE_INSTALL_PREFIX=/usr/local/appstack/gromacs-2019.5-icc -DGMX_MPI=OFF -DGMX_BUILD_OWN_FFTW=OFF -DGMX_GPU=ON -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda -DGMX_FFT_LIBRARY=mkl
$ make -j6 install

The product of the compilation will be installed in /usr/local/appstack/gromacs-2019.5-icc. You might change that destination by setting another value for -DCMAKE_INSTALL_PREFIX.
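To use the installed binaries from a login shell, the GROMACS bin directory has to be prepended to PATH. A minimal sketch (the prefix below assumes the default -DCMAKE_INSTALL_PREFIX value used above):

```shell
# Assumption: GMX_PREFIX matches the -DCMAKE_INSTALL_PREFIX used with cmake3
GMX_PREFIX=/usr/local/appstack/gromacs-2019.5-icc
export PATH="$GMX_PREFIX/bin:$PATH"
# The GROMACS bin directory should now be the first PATH entry
echo "${PATH%%:*}"
```

After that, running gmx --version in the same shell should report the freshly built binary.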


Using TLSv1.3 and strong cryptography for 389 Directory Server (on CentOS 7)

TLS v1.3 support came recently to the 389 Directory Server (via mod_nss) with the latest CentOS 7. Due to an inconsistency between EPEL and the CentOS 7 upstream, TLS v1.3 is not currently available to the dirsrv admin service!

To activate the TLS v1.3 protocol for 389 Directory Server, prepare an LDIF file describing the modifications to be made in "cn=encryption,cn=config" (a DN object which is part of the 389 start-up configuration), with the following content:

dn: cn=encryption,cn=config
changetype: modify
replace: sslVersionMin
sslVersionMin: TLS1.2

dn: cn=encryption,cn=config
changetype: modify
replace: nsSSL3
nsSSL3: off

dn: cn=encryption,cn=config
changetype: modify
replace: nsSSL3Ciphers
nsSSL3Ciphers:  +TLS_CHACHA20_POLY1305_SHA256,+TLS_AES_256_GCM_SHA384,+TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,+TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384,+TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,+TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256,+TLS_DHE_RSA_WITH_AES_256_GCM_SHA384

Save that content as "modify.ldif", then invoke ldapmodify, and authenticate as "cn=Directory Manager" to enforce the modification:

$ ldapmodify -D "cn=Directory Manager" -x -W -f modify.ldif

In case of successful modification, the following message will appear:

modifying entry "cn=encryption,cn=config"

At this point you need to restart 389 Directory Server:

# systemctl restart dirsrv@instance-name

and check whether the cipher suites requested above are really available on the server (replace "localhost" with the actual host name of your 389 server):

$ nmap -sV --script ssl-enum-ciphers -p 636 localhost

On recent Fedora, CentOS, and Ubuntu releases, one can use openssl (>= 1.1.1) to verify that TLS v1.3 is successfully configured and therefore available (just replace "server-name" below with your actual server name):

$ openssl s_client -connect server-name:636

This is how a positive result, one detecting the presence of TLS v1.3, will appear on your screen:

New, TLSv1.3, Cipher is TLS_CHACHA20_POLY1305_SHA256
Server public key is 4096 bit
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
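The negotiated protocol can also be extracted from the s_client output programmatically, e.g. in a monitoring script. A minimal sketch (the sample line below mirrors the output shown above; in real use, pipe `openssl s_client -connect server-name:636 </dev/null` into the same parsing step):

```shell
# Parse the protocol name out of the "New, TLSv1.3, Cipher is ..." status line
line='New, TLSv1.3, Cipher is TLS_CHACHA20_POLY1305_SHA256'
proto=$(printf '%s\n' "$line" | awk -F', ' '{print $2}')
echo "$proto"
```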

SmartCard-HSM USB token: Using Smart Card Shell 3 for initializing and configuring the token, generating key pairs, and importing keys and X.509 certificates from external PKCS#12 containers

Content:

  1. Introduction
  2. Prerequisites
  3. Downloading and installing Smart Card Shell 3
  4. Running Smart Card Shell 3 GUI
  5. Loading the key manager in Smart Card Shell 3 GUI
  6. Initializing the token and configuring DKEK to enable the import of keys and X.509 certificates from PKCS#12 files
  7. Generating ECC key pair (by means of DKEK shares)
  8. Importing key pair and the corresponding X.509 certificate from PKCS#12 file into the token (by means of DKEK shares)

 

1. Introduction

The SmartCard-HSM (Standard-A USB) token (you can order it online):

is a reliable, fast, secure, and OpenSC-compatible HSM device for generating, storing, and importing RSA, AES, and Elliptic Curve (EC) keys and X.509 certificates. Perhaps the best feature of the device is its enhanced support for EC (up to 521-bit keys) and its ability to import key pairs and certificates from PKCS#12 containers. The latter allows cloning a key pair into several token devices, as a hardware backup scenario.

Unfortunately, the vendor does not (yet) provide comprehensive documentation for end users, describing in detail the process of importing key pairs and X.509 certificates from PKCS#12 containers (files) into the token (something very much in demand). Therefore, the goal of this document is to fill (at least partially) that gap in the documentation.

Note that the procedures described in this document are not part of everyday practice. They are required only for initializing the token device, for generating EC keys for curves that are not currently listed as supported in the token's firmware (for instance, the secp384r1 curve is supported by the token's processor, but is not listed as supported in the firmware, so the OpenSC-based tools cannot request secp384r1 key generation), and for importing key pairs and X.509 certificates from PKCS#12 files.

 

2. Prerequisites

To be able to follow the steps given below, you need an installed and up-to-date Linux distribution running a graphical desktop environment (GNOME, KDE). A recent OpenJDK (17 is recommended, if available) must be installed and kept updated. Do not install OpenJDK manually, since it is an essential software package! Always use the package manager provided by the vendor of your Linux distribution to install or update OpenJDK:

  • RHEL7, CentOS 7, Scientific Linux 7:

    # yum install java-11-openjdk.x86_64
  • Fedora (current), RHEL8/9, CentOS 8, Scientific Linux 8, Rocky Linux 8/9, Alma Linux 9:

    # dnf install java-17-openjdk.x86_64
  • Ubuntu:

    # apt-get install openjdk-17-jdk-headless

It is not a good idea to configure the HSM token and manage its content on a system that is used for social networking, software testing, gaming, or any other activity that might be considered risky in this context. Always use a dedicated desktop system (or a dedicated Linux virtual machine) for managing your PKI infrastructure.

You might have more than one version of OpenJDK installed on your system, so the first step is to check that and set the latest OpenJDK as the default Java provider. Execute the following command (as the superuser, root):

# alternatives --config java

to check how many Java providers are installed and available locally, and which one of them is set as the current default. For example, the following result:

There are 2 programs which provide 'java'.

  Selection    Command
-----------------------------------------------
*+ 1           java-11-openjdk.x86_64 (/usr/lib/jvm/java-11-openjdk-11.0.19.0.7-1.el9_1.x86_64/bin/java)
   2           java-17-openjdk.x86_64 (/usr/lib/jvm/java-17-openjdk-17.0.7.0.7-1.el9_1.x86_64/bin/java)


Enter to keep the current selection[+], or type selection number:

means there are two OpenJDK packages installed, and the first one is set as the default Java provider (the entry marked with "+" in the first column). To set OpenJDK 17 as the default Java provider, type the ID number assigned to the package in the list (in the "Selection" column) and press "Enter" afterwards (in the above example, the ID used is 2):

Enter to keep the current selection[+], or type selection number: 2

It is always a good idea to check that the symlinks created by the alternatives tool point to the correct target. The simplest way to do so is to follow the symlink /etc/alternatives/java:

$ ls -al /etc/alternatives/java

and verify that the target is the OpenJDK 17 java executable:

lrwxrwxrwx. 1 root root 63 Apr 29 13:58 /etc/alternatives/java -> /usr/lib/jvm/java-17-openjdk-17.0.7.0.7-1.el9_1.x86_64/bin/java

Also check that the Java major version of the target:

$ java --version

is 17:

openjdk 17.0.7 2023-04-18 LTS
OpenJDK Runtime Environment (Red_Hat-17.0.7.0.7-1.el9_1) (build 17.0.7+7-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-17.0.7.0.7-1.el9_1) (build 17.0.7+7-LTS, mixed mode, sharing)
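The same check can be scripted, e.g. when preparing several machines. A sketch (the sample line below mirrors the `java --version` output shown above; in real use, capture it with `ver_line=$(java --version 2>&1 | head -n 1)`):

```shell
# Extract the major version from the first line of `java --version` output
ver_line='openjdk 17.0.7 2023-04-18 LTS'
major=$(printf '%s\n' "$ver_line" | awk '{print $2}' | cut -d. -f1)
if [ "$major" -ge 17 ]; then
  echo "OK: default Java major version is $major"
else
  echo "Default Java is too old ($major); fix it with: alternatives --config java"
fi
```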

Also, make sure the pcscd daemon is running.

 

3. Downloading and installing Smart Card Shell 3

Be sure you have OpenJDK 17 installed, as specified above. Then visit the web page:

https://www.openscdp.org/scsh3/download.html

click on "IzPack Installer" link, and save locally the provided JAR archive of the installer.

Decide what kind of installation of Smart Card Shell 3 you really need: to allow all users of the system to run the program code (run the installer as the superuser), or to limit that ability to a certain unprivileged user (perform the installation under that particular user ID):

  • run the installer as super user (root):

You should install the program into a system folder, where users can only read and execute the Java code (no write access should be given by default). That kind of restriction protects the executable code from deletion or modification.

  • run the installer as a non-privileged user:

In this case, the simplest solution is to install the program into the home folder of the user. That type of installation is recommended only for a user who really understands how to keep the executable code safe.

If followed, the steps given below will install the executable code of the program in the home folder of the user who runs the installer.

Open a terminal and type:

$ java -jar /path/to/scsh3.XX.YYY.jar

(here XX and YYY are numbers unique to the current version). The following window will appear (press the "Next" button there to continue):

Select the installation folder (use the button "Browse" to change it, if you do not like the one suggested by the installer), and press "Next":

Now you will be able to see the progress of the installation process (press the button "Next" to continue, when it is done):

Next, you need to decide whether or not to create a shortcut to the program in the GNOME "Applications" menu (it is recommended to create such a shortcut), and who will be able to invoke the installed program (the last is useful only if you install the software as super user or root into a system folder). Press the button "Next":

and in the last window of the installer, press "Done" to exit:

Important note for those who are running Smart Card Shell 3 on RHEL9 (Rocky Linux 9, Alma Linux 9)!

Smart Card Shell 3 needs the library libpcsclite.so, but no package provides libpcsclite.so on RHEL9. To overcome that issue, install the package pcsc-lite-libs (if it is not already installed) and create the symlink /usr/lib64/libpcsclite.so that points to /usr/lib64/libpcsclite.so.1:

cd /usr/lib64
ln -s libpcsclite.so.1 libpcsclite.so

 

4. Running Smart Card Shell 3 GUI

Be sure the Smart Card Shell 3 GUI is installed. Expand the "Applications" menu (1), go to "Programming" (2), press there "Smart Card Shell 3" (3):

and wait for the appearance of the main window of the program:

During the first run, a new window might appear, asking for configuring the path to a working directory, where the output files will be stored by default. Click the "Browse" button:

select the folder (1) and press "Open" (2) to go back:

The path to the folder will appear in the text field (next to the "Browse" button). In addition, mark at least "Use this as the default and do not ask again" to complete the configuration, and press "OK" to exit:

 

5. Loading the key manager in Smart Card Shell 3 GUI

Run the Smart Card Shell 3 GUI. The key manager is a loadable script dedicated to managing the objects in the token.

To load it, either expand "File" menu and select there "Key Manager":

or press "Ctrl+M". Once loaded, the key manager will check whether the token is connected and will build, in the main window, a tree of the objects it discovered in the token. Details about all important events will be reported in the "Shell" tab:

 

6. Initializing the token and configuring DKEK to enable the import of keys and X.509 certificates from PKCS#12 files

The goal of the initialization process is to enable the import (and export) of keys and X.509 certificates, stored in files (most often PKCS#12 files), into the token, based on a "device key encryption key" (DKEK) type of store. Note that DKEK is not enabled by default.

WARNING! DURING THE INITIALIZATION, ALL DATA, STORED IN THE TOKEN, WILL BE LOST!

To start with the initialization, run the Smart Card Shell 3, load the key manager script, click once with the right button of the mouse upon "SmartCard-HSM" (that is the root of the key manager tree), and select "Initialize Device" in the menu:

Supply the following information (or press "Cancel" to terminate the initialization):

  • The actual SO-PIN for configuring the token. The default SO-PIN code is 3537363231383830, unless it has been changed (if you forget the SO-PIN, consider the token lost). Press "OK" to continue:

  • The label of the token (that is the token's friendly name, displayed in the PKI applications). Press "OK" to continue:

  • The authentication mechanism for restricting access to the objects and processor of the token. In most cases, you should select "User PIN" (you may set another authentication mechanism, but this one is the most popular). Press "OK" to continue:

  • The way to restore the access to the keys and X.509 certificates, if the PIN is lost, forgotten, or locked (if a wrong PIN is entered more than 3 times, consecutively). Select "Resetting PIN with SO-PIN allowed" and press "OK" (select "Resetting PIN with SO-PIN not allowed" only in specific cases, where the implementation of such policy is necessary):

  • The new PIN code (do not use the number shown in the picture below). Press "OK" to continue:

  • The new PIN code (again, for confirmation). Press "OK" to continue:

  • Request for using "DKEK Shares". Press "OK" to continue:

  • The number of DKEK shares (use 1, unless you are an expert). Press "OK" to continue:

  • Press "Cancel" here (if you press "OK" both SO-PIN and PIN codes will be stored locally in an unencrypted file):

After the success of the initialization, you will see only three objects displayed in the key manager tree: User PIN, SO-PIN, and DKEK entry. The message "Initializing complete" (an indication that the requested initialization has been successfully completed) will be seen in the "Shell" tab:

Note that, at this point, the requested DKEK shares are not yet initialized and/or imported into the token! The appearance of "DKEK set-up in progress with 1 of 1 shares missing" in the key manager tree indicates that. You need to manually request the creation of the DKEK share file and import its content into the token, by strictly following the instructions given below:

  • Request the creation of a DKEK share, by clicking once with the right button of the mouse on the root of the key manager tree (on "SmartCard-HSM") and picking "Create DKEK share" in the menu:

  • Enter the name of DKEK file to create, and press "OK" (store the file in the working directory of the program, configured during the first run):

  • Set the password for protecting the DKEK share file content and press "OK":

  • Confirm the password for protecting the DKEK share file content and press "OK":

  • With the left button of the mouse, click once upon the object named "DKEK set-up in progress with 1 of 1 shares missing", displayed in the key manager section of the main window of the program:

  • Use the button "Browse" to find and choose the created DKEK file (file extension is *.pbe), and press "OK":

  • Enter the password set (before) for protecting the content of the DKEK file:

It will take up to 10 seconds to derive the keys and import the DKEK into the token (watch the related messages, appearing in the "Shell" tab). At the end, you will see that the object "DKEK set-up in progress with 1 of 1 shares missing" (in the key manager tree) will be renamed (the new name will include the ID of the DKEK object):

IMPORTANT! At this point, you need to store a copy of the DKEK share file, generated during the initialization, in a safe place!

In the examples above, that file is /home/vesso/CardContact/2019_02.pbe, but in your case the file will have a different name and location.

 

7. Generating ECC key pair (by means of DKEK shares)

IMPORTANT! Be absolutely sure that no application, other than Smart Card Shell 3, is communicating with the token. Stop all running processes of PKCS#11 compatible software (like Mozilla Firefox, Mozilla Thunderbird, Google Chrome, XCA), that might take over the token.

Start the Smart Card Shell 3, plug the token into the USB port, and load the key manager script. Be sure that the token is initialized properly, to support DKEK shares.

Click once on the token name, in the key manager section, with the right button of the mouse and select "Generate ECC Key" in the menu:

Select the elliptic curve type, and press "OK":

Provide a friendly name (alias) for the key pair (it is an internal name for naming the key pair object in the token) and press "OK":

Type (separated by commas) the list of hex codes of the signature algorithms that will be allowed for signing with this key (the most commonly used ones, 73,74,75, are given in the example below), then press "OK":

Wait until the token finishes generating the requested key pair. Once ready, you will see the new key object under the tree of the DKEK share:

 

8. Importing key pair and the corresponding X.509 certificate from PKCS#12 file into the token (by means of DKEK shares)

IMPORTANT! Be absolutely sure that no application, other than Smart Card Shell 3, is communicating with the token. Stop all running processes of PKCS#11 compatible software (like Mozilla Firefox, Mozilla Thunderbird, Google Chrome, XCA), that might take over the token.

Otherwise the PKCS#12 import might fail, raising (in the "Shell" tab) the following error:

GPError: Card (CARD_INVALID_SW/27010) - "Unexpected SW1/SW2=6982 (Checking error: Security condition not satisfied) received" in ...

Start the Smart Card Shell 3, plug the token into the USB port, and load the key manager script. Be sure that the token is initialized properly, to support DKEK shares. Using a DKEK share is mandatory for this operation.

  • Click once on the token name in the key manager section, with the right button of the mouse, and select "Import from PKCS#12" in the menu:

  • Specify the number of DKEK shares to use (use 1, if you follow the recipes provided in this document), and click "OK":

  • Select the file, containing the DKEK shares (use the file name created during the initialization), and click "OK":

  • Enter the password for decrypting the DKEK file (that password is set during the creation of the file), click "OK", and wait up to 10 seconds for generating the shared keys:

  • Select the PKCS#12 file and click "OK":

  • Provide the password, set for protecting the content of the PKCS#12 file, and click "OK":

  • Select the key pair and X.509 certificate to import from the PKCS#12 file, by choosing their internal PKCS#12 name, and click "OK":

  • Enter a name to assign to the imported key pair and X.509 certificate, and click "OK":

  • Click "OK" if you want to import more key pairs and X.509 certificates stored in the same PKCS#12 file, or click "Cancel" to finish:

If the import is successful, you will see the key pair imported into the DKEK share (in the key manager section), and information about the process in the "Shell" section (as shown in the red frames below):

IMPORTANT! You cannot import an X.509 certificate chain from a PKCS#12 container into the token using the procedure proposed above.

But you might do that later using pkcs11-tool, if that is really necessary. Notice that the X.509 certificates in the chain are public information and can be used out of the box (installed in the software certificate repository of the browser which will be used with the token). Their presence in the token storage is not mandatory.


Using SmartCard-HSM 4K USB-Token to work with ECDSA keys in RHEL, CentOS, and Enterprise Linux, versions 7 and 8

Content:

1. Introduction

2. Connecting the device

3. Downloading, compiling, and installing the last stable release of OpenSC and HSM code

4. Initializing the device (setting the PINs)

5. Generating 521-bit EC key pair

6. Using the USB token as OpenSSL PKCS#11 engine

7. Generating CSR, based on a key pair previously stored in the token, through OpenSSL PKCS#11 engine

8. Importing X.509 certificate to the token storage

 

1. Introduction

In modern cryptographic systems, the use of Elliptic Curve Cryptography (ECC) has become the de facto standard. Meanwhile, even now (it is 2019), it is quite rare to find a cryptographic token that supports the generation and use of ECDSA keys up to 521 bits and the SHA-512 hash function (embedded in the processor of the token), and comes with a driver for Linux support. Most of the available tokens support mostly RSA key pairs, and even when they support ECDSA, that support is partial and the maximum key size is well below 521 bits.

Surprisingly, you can find a reliable crypto token that supports 521-bit ECDSA keys and SHA-512 at:

cardomatic.de

The SmartCard-HSM 4K USB-Token they have on sale works with the standard CCID driver (libccid.so) and is supported by OpenSC (version 0.16 or higher). Downloading, compiling, and installing both the latest OpenSC release and the HSM code locally is sufficient to employ the SmartCard-HSM 4K USB-Token in OpenSSL (as an engine), and in Mozilla Firefox and Mozilla Thunderbird (as a security device).

The notes given below explain how to make the SmartCard-HSM 4K USB-Token available on a Linux system running CentOS 7, Scientific Linux 7, Red Hat Enterprise Linux 7, and (possibly) Red Hat Enterprise Linux 8, how to initialize the device, and how to interact with it using the OpenSC tools (like pkcs11-tool and pkcs15-tool).

 

2. Connecting the device

Install the following packages (there is a good chance they are already installed):

# yum install pcsc-lite pcsc-lite-ccid opensc

Enable the daemon pcscd to be loaded when starting/restarting the system:

# systemctl enable pcscd

and run it manually afterwards:

# systemctl start pcscd

Now you can connect the SmartCard-HSM 4K USB-Token to the system. Records like those given below will be added to /var/log/messages by syslog:

May 10 15:00:06 argus kernel: usb 3-10: new full-speed USB device number 9 using xhci_hcd
May 10 15:00:06 argus kernel: usb 3-10: New USB device found, idVendor=04e6, idProduct=5816
May 10 15:00:06 argus kernel: usb 3-10: New USB device strings: Mfr=1, Product=2, SerialNumber=5
May 10 15:00:06 argus kernel: usb 3-10: Product: uTrust 3512 SAM slot Token
May 10 15:00:06 argus kernel: usb 3-10: Manufacturer: Identiv
May 10 15:00:06 argus kernel: usb 3-10: SerialNumber: 55511247701280

Their appearance there does not necessarily mean the device is accessible by applications. It only indicates that the USB device is properly recognized by libusb. In addition, the device will appear under Vendor Name: SCM Microsystems, Inc., Vendor ID: 04e6, and Product ID: 5816 in the output of lsusb:

Bus 003 Device 009: ID 04e6:5816 SCM Microsystems, Inc.

If detailed debugging information regarding the device detection performed by the pcscd daemon is required, modify the content of the file /usr/lib/systemd/system/pcscd.service (used by systemd to invoke the daemon pcscd), by adding the additional option --debug to the end of the line starting with "ExecStart":

ExecStart=/usr/sbin/pcscd --foreground --auto-exit --debug

Save the changes to the file, load the new list of declarations:

# systemctl daemon-reload

and finally restart the daemon pcscd:

# systemctl restart pcscd

At that moment, start monitoring the new content appended to the file /var/log/messages (# tailf /var/log/messages will do), then unplug and plug in the device, and watch what kind of messages the syslog daemon appends to the file.

To prevent fast growth of /var/log/messages on disk, immediately after finishing the analysis of the debug information, restore the original declarations in /usr/lib/systemd/system/pcscd.service (remove the additional --debug option), reload the systemd configuration, and restart the daemon pcscd!
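As a side note, instead of editing the vendor unit file under /usr/lib/systemd/system (which a package update may overwrite), the same --debug flag can be added through a systemd drop-in. A sketch (written to a temporary directory purely for illustration; the real drop-in path would be /etc/systemd/system/pcscd.service.d/debug.conf, created as root and followed by `systemctl daemon-reload` and `systemctl restart pcscd`):

```shell
# Sketch: create a systemd drop-in that overrides ExecStart with the --debug flag.
# Writing to a temp dir here; the real target directory is
# /etc/systemd/system/pcscd.service.d (requires root).
dropin_dir=$(mktemp -d)
cat > "$dropin_dir/debug.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/sbin/pcscd --foreground --auto-exit --debug
EOF
# The empty ExecStart= line clears the original command before setting the new one
grep -c '^ExecStart=' "$dropin_dir/debug.conf"
```

Removing the drop-in file (and reloading systemd) restores the original behavior without touching the vendor unit.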

 

3. Downloading, compiling, and installing the last stable release of OpenSC and HSM code

First, install the required RPM packages (if they are not already installed):

# yum install git gcc gcc-c++ make

Then fetch the code of the sc-hsm-embedded project, hosted on GitHub, and configure, compile, and install it (change the destination folder after --prefix, if needed):

$ git clone https://github.com/CardContact/sc-hsm-embedded.git
$ cd sc-hsm-embedded
$ ./configure --prefix=/usr/local/sc-hsm-embedded
$ make
$ sudo make install

General note: Red Hat Enterprise Linux 8 (as well as CentOS 8 and Scientific Linux 8) ships OpenSC version 0.19. That means one does not need to compile the OpenSC code, as explained below, on systems running EL8.

Once the installation is successfully completed, download, compile, and install the latest stable release of the OpenSC code (change the destination folder after --prefix and --with-completiondir, if needed):

$ wget https://github.com/OpenSC/OpenSC/releases/download/0.19.0/opensc-0.19.0.tar.gz
$ tar xvf opensc-0.19.0.tar.gz
$ cd opensc-0.19.0
$ ./configure --prefix=/usr/local/opensc-0.19.0 --with-completiondir=/usr/local/opensc-0.19.0/share/bash-completion/completions
$ make
$ sudo make install

The next step is to add the folders containing the installed binaries and libraries to the user's PATH and LD_LIBRARY_PATH environment variables. Those additions can be made either system-wide (available to all users) or user-specific (available only to particular users):

  • system-wide:

    Write down the following export declarations:

    export PATH=/usr/local/opensc-0.19.0/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/sc-hsm-embedded/lib:/usr/local/opensc-0.19.0/lib:$LD_LIBRARY_PATH

    to the file /etc/profile.d/opensc.sh (create the file; it does not exist by default).

  • user-specific:

    Open each ~/.bashrc file and append (to the end) the following lines:

    export PATH=/usr/local/opensc-0.19.0/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/sc-hsm-embedded/lib:/usr/local/opensc-0.19.0/lib:$LD_LIBRARY_PATH
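For the system-wide variant, the file can be created in one step. A sketch (writing to a temporary file purely for illustration; the real target is /etc/profile.d/opensc.sh and requires root):

```shell
# Sketch: write the profile snippet; the real target is /etc/profile.d/opensc.sh
target=$(mktemp)
cat > "$target" <<'EOF'
export PATH=/usr/local/opensc-0.19.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/sc-hsm-embedded/lib:/usr/local/opensc-0.19.0/lib:$LD_LIBRARY_PATH
EOF
# Both export declarations should now be present in the file
grep -c '^export ' "$target"
```

The snippet takes effect in new login shells; for the current shell, source the file manually.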

 

4. Initializing the device (setting the PINs)

The SmartCard-HSM 4K USB-Token device comes uninitialized! That means the device cannot be used until the initialization process is completed.

Connect the token to the system and invoke the tool sc-hsm-tool, specifying the SO-PIN (SO stands for "Security Officer"; replace the SO-PIN 3537363231383830 used in the example below with another unique 16-digit number and do not forget it) and the user PIN (change 123456 to another number):

$ sc-hsm-tool --initialize --so-pin 3537363231383830 --pin 123456

Here, the number after --so-pin is the SO-PIN and the one after --pin is the user PIN. Note that the SO-PIN is the most important PIN, for it allows write access to all slots of the device storage. It can also change every PUK and PIN previously stored there (including the SO-PIN itself). So do not lose the SO-PIN!


5. Generating 521-bit EC key pair

Invoke the tool pkcs11-tool and specify the key type (EC:secp521r1 for a 521-bit EC key) and the key pair ID and label (they help later to access the keys):

$ pkcs11-tool --module libsc-hsm-pkcs11.so -l --keypairgen --key-type EC:secp521r1 --id 10 --label "EC pair ID 10"

The user's PIN will be requested next, to grant access to the device storage and processor:

Enter PKCS#11 token PIN for SmartCard-HSM:

If the provided user PIN is correct, it will take about 2-3 seconds to complete the key generation process.

The most important part of the command line above is the ID number assigned to the key pair (10 in the example, but you may reserve any other number not already taken, including hexadecimal syntax), because from that moment onward the keys of that pair can be selected and used in operations only if their ID number is specified.

 

6. Using the USB token as OpenSSL PKCS#11 engine

Install the package openssl-pkcs11 to enable the OpenSSL PKCS#11 engines functionality:

# yum install openssl-pkcs11

After that, store the following minimal OpenSSL configuration into a local file (openssl_hsm.cnf, for the sake of our illustration process):

openssl_conf = openssl_def

[openssl_def]
engines = engine_section

[engine_section]
pkcs11 = pkcs11_section

[pkcs11_section]
engine_id = libsc-hsm-pkcs11
dynamic_path = /usr/lib64/engines-1.1/pkcs11.so
MODULE_PATH = /usr/local/opensc-0.19.0/lib/libsc-hsm-pkcs11.so
init = 0


[ req ]
distinguished_name      = req_distinguished_name
string_mask = utf8only

[ req_distinguished_name ]

The only declaration in openssl_hsm.cnf one might need to change is that of MODULE_PATH:

MODULE_PATH = /usr/local/opensc-0.19.0/lib/libsc-hsm-pkcs11.so

Replace /usr/local/opensc-0.19.0 there with the actual folder where the file libsc-hsm-pkcs11.so is stored (the file is placed there during the installation of the HSM code).

 

7. Generating CSR, based on a key pair previously stored in the token, through OpenSSL PKCS#11 engine

If the OpenSSL PKCS#11 engine configuration file is available, the generation of a CSR (PKCS#10 block) requires only specifying the key pair ID (that is the number assigned to the key pair during key pair generation):

$ openssl req -new -subj "/CN=server.example.com" -out request.pem -sha512 -config openssl_hsm.cnf -engine pkcs11 -keyform engine -key 10

One can expand both command line arguments and the set of declarations stored in openssl_hsm.cnf, if the CSR should contain some specific extensions, requested by the certificate issuer.

 

8. Importing X.509 certificate to the token storage

That type of operation is possible only if the X.509 certificate is stored in DER (binary) format. For example, if the file SU_ECC_Root_CA.crt contains a PEM-formatted X.509 certificate, it can be converted into DER format using openssl:

$ openssl x509 -in SU_ECC_Root_CA.crt -outform DER -out SU_ECC_Root_CA.der
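A quick way to confirm the conversion round-trips correctly is to parse the DER file back with openssl. A sketch using a throwaway self-signed certificate (the file names and the /CN=test subject are illustrative, not the real CA files):

```shell
# Sketch: generate a throwaway self-signed EC certificate, convert PEM -> DER,
# then parse the DER file back to confirm the conversion is lossless.
workdir=$(mktemp -d)
cd "$workdir"
openssl req -x509 -newkey ec -pkeyopt ec_paramgen_curve:prime256v1 \
  -keyout key.pem -out cert.pem -nodes -days 1 -subj "/CN=test" 2>/dev/null
openssl x509 -in cert.pem -outform DER -out cert.der
# The subject read back from the DER file should match the original
openssl x509 -inform DER -in cert.der -noout -subject
```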

To import the DER-formatted X.509 certificate into the token storage, use the tool pkcs11-tool:

$ pkcs11-tool --module libsc-hsm-pkcs11.so --login --write-object SU_ECC_Root_CA.der --type cert --id `uuidgen | tr -d -` --label "SU_ECC_ROOT_CA"

The label string after --label, naming the X.509 certificate, should hint at the purpose of having and using that certificate.
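For reference, the `uuidgen | tr -d -` construct in the command above simply strips the hyphens from a UUID, yielding a 32-character hexadecimal object ID. A sketch with a fixed UUID, so the result is reproducible (the real command uses uuidgen for a random one):

```shell
# A fixed example UUID; `uuidgen` would produce a random one of the same shape
uuid='123e4567-e89b-12d3-a456-426614174000'
id=$(printf '%s' "$uuid" | tr -d -)
echo "$id"      # the hyphens are removed
echo "${#id}"   # leaving 32 hex characters
```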


Installing Intel Python 3, tensorflow-gpu, and multiple versions of CUDA and cuDNN on CentOS 7

Content:

1. Introduction

2. Enabling the use of EPEL repository

3. Installing multiple CUDA versions on CentOS 7

4. Monitoring the NVidia GPU device by nvidia-smi

5. Making the software state in the NVIDIA driver persistent

6. Installing cuDNN for multiple versions of CUDA

7. Installing Intel Python 3 and tensorflow-gpu

8. Testing the CUDA and cuDNN installation

8.1. Testing if cuDNN library is loadable

8.2. Testing the CUDA Python 3 integration by using Numba

8.3. Testing the CUDA Python 3 integration by using tensorflow-gpu


1. Introduction

This publication describes how to install multiple versions of CUDA and cuDNN on the same system running CentOS 7 to support various applications, and tensorflow in particular (via tensorflow-gpu). The recipes provided below can be followed when adding GPU computing support to compute nodes that are part of an HPC cluster.


2. Enabling the use of EPEL repository

This is an optional step, applicable if the configuration for using the EPEL repository is not present in /etc/yum.repos.d. EPEL is required here because the installation of the NVidia graphics driver, part of the CUDA packages, requires DKMS to be present on the system in advance. That package is included in EPEL. To use EPEL, first install its repository package:

# yum install epel-release
# yum update

The dkms RPM package will be installed later, as a dependency required by the CUDA packages (see the next section).


3. Installing multiple CUDA versions on CentOS 7

The most reasonable question here is why we need multiple versions of CUDA installed and supported locally on the same system. The answer is straightforward: it is all about application-specific requirements. Some software products are very specific about the version of CUDA they run against.

The most rational way to install the CUDA packages on CentOS 7 is through yum. NVidia provides the configuration files for using their yum repositories as a separate RPM package, which might be downloaded here:

https://developer.nvidia.com/cuda-downloads

To initiate the download, successively select Linux > x86_64 > CentOS > 7 > rpm (network) > Download, as shown in the screenshots below:


Then install the package by following the instructions given below the "Download" button.

From time to time some inconsistencies appear in the CUDA yum repository. To prevent any problems they might cause, edit the file /etc/yum.repos.d/cuda.repo by changing the line:

enabled=1

into

enabled=0

From now on, every time access to the CUDA repository RPM packages is required, supply the command line option --enablerepo=cuda to yum.
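The repository edit above can also be scripted; a minimal sketch (the function name is made up here, the repo file path is the one from this post, and GNU sed is assumed):

```shell
# Helper: flip enabled=1 to enabled=0 in a yum repo file (in-place edit)
disable_repo_file() {
    sed -i 's/^enabled=1/enabled=0/' "$1"
}
```

Invoke it as `disable_repo_file /etc/yum.repos.d/cuda.repo` (run as root).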

After finishing with the yum configuration, install the RPM packages containing the versions of CUDA currently supported by the vendor:

# yum --enablerepo=cuda install cuda-8-0 cuda-9-0 cuda-9-1

That will install plenty of packages. Take into account their installation size and prepare to meet that demand for disk space.

If, by any chance, the installer fails to install the packages nvidia-kmod, xorg-x11-drv-nvidia, xorg-x11-drv-nvidia-libs, and xorg-x11-drv-nvidia-gl, install them separately:

# yum --enablerepo=cuda install nvidia-kmod xorg-x11-drv-nvidia xorg-x11-drv-nvidia-libs xorg-x11-drv-nvidia-gl

4. Monitoring the NVidia GPU device by nvidia-smi

The tool nvidia-smi is part of the package xorg-x11-drv-nvidia. It shows the current status of the NVidia GPU device:

$ nvidia-smi

Sun May  6 17:15:10 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.30                 Driver Version: 390.30                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K620         On   | 00000000:02:00.0 Off |                  N/A |
| 34%   36C    P8     1W /  30W |      1MiB /  2000MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

That tool is useful for checking how many applications are currently running on the GPU device, the temperature, the power consumption, the utilization rate, and the amount of memory taken by the applications.


5. Making the software state in the NVIDIA driver persistent

To prevent the driver from releasing the NVidia GPU device when that device is not in use by any process, the daemon nvidia-persistenced (part of the package xorg-x11-drv-nvidia) needs to be enabled and started:

# systemctl enable nvidia-persistenced
# systemctl start nvidia-persistenced

6. Installing cuDNN for multiple versions of CUDA

The cuDNN library and header files can be downloaded from the web page of the vendor at:

https://developer.nvidia.com/cudnn

Note that a proper user registration is required to obtain the cuDNN files. Also, you need to download the archives with the cuDNN library and header files for each and every CUDA version locally installed and supported. That process will end up bringing the following files into the download directory:

cudnn-8.0-linux-x64-v6.0.tgz
cudnn-9.0-linux-x64-v7.tgz
cudnn-9.1-linux-x64-v7.tgz

To proceed with the installation, unpack the content of the archives into the respective CUDA installation folders and recreate the database with the dynamic linker run time bindings, by executing (as root or super user) the command lines:

# tar --strip-components 1 -xf cudnn-8.0-linux-x64-v6.0.tgz -C /usr/local/cuda-8.0
# tar --strip-components 1 -xf cudnn-9.0-linux-x64-v7.tgz -C /usr/local/cuda-9.0
# tar --strip-components 1 -xf cudnn-9.1-linux-x64-v7.tgz -C /usr/local/cuda-9.1
# ldconfig /

It is recommended to check that the archives unpacked successfully and that the dynamic linker cache was properly recreated, by listing the cache and grepping the output for the string "cudnn":

$ ldconfig -p | grep cudnn

The grep output indicating a successful cuDNN installation will look like:

libcudnn.so.7 (libc6,x86-64) => /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcudnn.so.7
libcudnn.so.7 (libc6,x86-64) => /usr/local/cuda-9.1/targets/x86_64-linux/lib/libcudnn.so.7
libcudnn.so.6 (libc6,x86-64) => /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so.6
libcudnn.so (libc6,x86-64) => /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so
libcudnn.so (libc6,x86-64) => /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcudnn.so
libcudnn.so (libc6,x86-64) => /usr/local/cuda-9.1/targets/x86_64-linux/lib/libcudnn.so

Do not be confused by the multiple entries for libcudnn.so in the cache (as seen in the output above). Seemingly that indicates a collision, but note that each libcudnn.so file is a symlink that also carries a unique version number. That number is used by the tensorflow libraries to find which of the files best matches the version requirements.
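For a quick look at which soname carries the highest version number, the cache listing can be piped through a small filter; a sketch assuming GNU awk and sort with -V (version sort) support:

```shell
# Print the highest-versioned libcudnn soname found in `ldconfig -p`
# output read from stdin, e.g. "libcudnn.so.7"
latest_cudnn() {
    awk '$1 ~ /^libcudnn\.so\.[0-9]+$/ {print $1}' | sort -V | tail -n 1
}
```

Run it as `ldconfig -p | latest_cudnn`.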


7. Installing Intel Python 3 and tensorflow-gpu

If Intel Python 3 is not available on the system, follow the instructions given here:

https://software.intel.com/en-us/articles/installing-intel-free-libs-and-python-yum-repo

on how to install it. It is a single RPM package (mind its large installation size of several gigabytes) which contains tensorflow but (currently) does not include the tensorflow-gpu module. Once Intel Python 3 is available, the tensorflow-gpu module can be installed by invoking pip (the one provided by Intel Python 3).

Do not install tensorflow-gpu or any other module for Intel Python 3 as root or super user. Avoid any module installations inside the /opt/intel/intelpython3/ folder. Instead, perform the installation as an unprivileged user and append the --user option to pip:

$ /opt/intel/intelpython3/bin/pip install --user tensorflow-gpu

The output information generated during the installation process should look like:

Collecting tensorflow-gpu
  Downloading https://files.pythonhosted.org/packages/59/41/ba6ac9b63c5bfb90377784e29c4f4c478c74f53e020fa56237c939674f2d/tensorflow_gpu-1.8.0-cp36-cp36m-manylinux1_x86_64.whl (216.2MB)
    100% |████████████████████████████████| 216.3MB 7.8kB/s 
Collecting protobuf>=3.4.0 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/74/ad/ecd865eb1ba1ff7f6bd6bcb731a89d55bc0450ced8d457ed2d167c7b8d5f/protobuf-3.5.2.post1-cp36-cp36m-manylinux1_x86_64.whl (6.4MB)
    100% |████████████████████████████████| 6.4MB 266kB/s 
Collecting gast>=0.2.0 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/5c/78/ff794fcae2ce8aa6323e789d1f8b3b7765f601e7702726f430e814822b96/gast-0.2.0.tar.gz
Collecting termcolor>=1.1.0 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/8a/48/a76be51647d0eb9f10e2a4511bf3ffb8cc1e6b14e9e4fab46173aa79f981/termcolor-1.1.0.tar.gz
Requirement already satisfied: wheel>=0.26 in /opt/intel/intelpython3/lib/python3.6/site-packages (from tensorflow-gpu)
Collecting tensorboard<1.9.0,>=1.8.0 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/59/a6/0ae6092b7542cfedba6b2a1c9b8dceaf278238c39484f3ba03b03f07803c/tensorboard-1.8.0-py3-none-any.whl (3.1MB)
    100% |████████████████████████████████| 3.1MB 545kB/s 
Collecting grpcio>=1.8.6 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/c8/b8/00e703183b7ae5e02f161dafacdfa8edbd7234cb7434aef00f126a3a511e/grpcio-1.11.0-cp36-cp36m-manylinux1_x86_64.whl (8.8MB)
    100% |████████████████████████████████| 8.8MB 195kB/s 
Collecting astor>=0.6.0 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/b2/91/cc9805f1ff7b49f620136b3a7ca26f6a1be2ed424606804b0fbcf499f712/astor-0.6.2-py2.py3-none-any.whl
Requirement already satisfied: numpy>=1.13.3 in /opt/intel/intelpython3/lib/python3.6/site-packages (from tensorflow-gpu)
Requirement already satisfied: six>=1.10.0 in /opt/intel/intelpython3/lib/python3.6/site-packages (from tensorflow-gpu)
Collecting absl-py>=0.1.6 (from tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/90/6b/ba04a9fe6aefa56adafa6b9e0557b959e423c49950527139cb8651b0480b/absl-py-0.2.0.tar.gz (82kB)
    100% |████████████████████████████████| 92kB 8.8MB/s 
Requirement already satisfied: setuptools in /opt/intel/intelpython3/lib/python3.6/site-packages (from protobuf>=3.4.0->tensorflow-gpu)
Requirement already satisfied: werkzeug>=0.11.10 in /opt/intel/intelpython3/lib/python3.6/site-packages (from tensorboard<1.9.0,>=1.8.0->tensorflow-gpu)
Collecting bleach==1.5.0 (from tensorboard<1.9.0,>=1.8.0->tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/33/70/86c5fec937ea4964184d4d6c4f0b9551564f821e1c3575907639036d9b90/bleach-1.5.0-py2.py3-none-any.whl
Collecting markdown>=2.6.8 (from tensorboard<1.9.0,>=1.8.0->tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/6d/7d/488b90f470b96531a3f5788cf12a93332f543dbab13c423a5e7ce96a0493/Markdown-2.6.11-py2.py3-none-any.whl (78kB)
    100% |████████████████████████████████| 81kB 8.9MB/s 
Collecting html5lib==0.9999999 (from tensorboard<1.9.0,>=1.8.0->tensorflow-gpu)
  Downloading https://files.pythonhosted.org/packages/ae/ae/bcb60402c60932b32dfaf19bb53870b29eda2cd17551ba5639219fb5ebf9/html5lib-0.9999999.tar.gz (889kB)
    100% |████████████████████████████████| 890kB 1.7MB/s 
Building wheels for collected packages: gast, termcolor, absl-py, html5lib
  Running setup.py bdist_wheel for gast ... done
  Stored in directory: /home/vesso/.cache/pip/wheels/9a/1f/0e/3cde98113222b853e98fc0a8e9924480a3e25f1b4008cedb4f
  Running setup.py bdist_wheel for termcolor ... done
  Stored in directory: /home/vesso/.cache/pip/wheels/7c/06/54/bc84598ba1daf8f970247f550b175aaaee85f68b4b0c5ab2c6
  Running setup.py bdist_wheel for absl-py ... done
  Stored in directory: /home/vesso/.cache/pip/wheels/23/35/1d/48c0a173ca38690dd8dfccfa47ffc750db48f8989ed898455c
  Running setup.py bdist_wheel for html5lib ... done
  Stored in directory: /home/vesso/.cache/pip/wheels/50/ae/f9/d2b189788efcf61d1ee0e36045476735c838898eef1cad6e29
Successfully built gast termcolor absl-py html5lib
Installing collected packages: protobuf, gast, termcolor, html5lib, bleach, markdown, tensorboard, grpcio, astor, absl-py, tensorflow-gpu
Successfully installed absl-py-0.2.0 astor-0.6.2 bleach-1.5.0 gast-0.2.0 grpcio-1.11.0 html5lib-0.9999999 markdown-2.6.11 protobuf-3.5.2.post1 tensorboard-1.8.0 tensorflow-gpu-1.8.0 termcolor-1.1.0

NOTE: The files brought by the tensorflow-gpu installation to the local file system will be located under ${HOME}/.local/lib/python3.6/site-packages/ directory!
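The user-level install location can be confirmed by asking the interpreter itself; a sketch using plain python3 (substitute the Intel Python 3 path from above if needed):

```shell
# Print the per-user site-packages directory, i.e. where
# `pip install --user` places modules for this interpreter
python3 -m site --user-site
```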

8. Testing the CUDA and cuDNN installation

8.1. Testing if cuDNN library is loadable

That kind of test is very easy to perform. If it returns no error, all symbols brought by the library libcudnn.so are known to the Python 3 interpreter.

To perform the test create the Python 3 script:

import ctypes

# Attempt to load the cuDNN shared library through the dynamic linker
t = ctypes.cdll.LoadLibrary("libcudnn.so")

print(t._name)

save it as a file under the name cudnn_loading_checker.py and then execute the script:

$ /opt/intel/intelpython3/bin/python3 cudnn_loading_checker.py

If the libcudnn.so is successfully loaded the script will return the name of the library file:

libcudnn.so

and raise an error message otherwise.

8.2. Testing the CUDA Python 3 integration by using Numba

Along with the other modules for scientific computing and data analysis, the Intel Python 3 package supplies Numba. To perform GPU computing based on CUDA, the Numba JIT compiler requires the environment variables NUMBAPRO_NVVM and NUMBAPRO_LIBDEVICE to be properly declared before compiling any Python code containing GPU instructions. Those variables should point into the installation tree of the latest version of CUDA:

$ export NUMBAPRO_NVVM=/usr/local/cuda-9.1/nvvm/lib64/libnvvm.so.3.2.0
$ export NUMBAPRO_LIBDEVICE=/usr/local/cuda-9.1/nvvm/libdevice

It is highly recommended to declare these variables in the ${HOME}/.bashrc file.

Once the variables are declared and loaded, execute the test script /opt/intel/intelpython3/lib/python3.6/site-packages/numba/cuda/tests/cudapy/test_matmul.py:

$ /opt/intel/intelpython3/bin/python3 /opt/intel/intelpython3/lib/python3.6/site-packages/numba/cuda/tests/cudapy/test_matmul.py

In case of successful execution the script will exit by displaying the message:

.
----------------------------------------------------------------------
Ran 1 test in 0.093s

OK

8.3. Testing the CUDA Python 3 integration by using tensorflow-gpu

A simple script for testing tensorflow-gpu can be found here:

https://github.com/yaroslavvb/stuff/blob/master/matmul_benchmark.py

It should be downloaded and then executed by using Intel Python 3 interpreter:

$ /opt/intel/intelpython3/bin/python3 matmul_benchmark.py

and in case of successful execution the following result will appear on the screen:

/opt/intel/intelpython3/lib/python3.6/site-packages/h5py/__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
2018-05-06 16:21:22.591713: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-05-06 16:21:22.684411: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-05-06 16:21:22.684824: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties: 
name: Quadro K620 major: 5 minor: 0 memoryClockRate(GHz): 1.124
pciBusID: 0000:01:00.0
totalMemory: 1.95GiB freeMemory: 1.92GiB
2018-05-06 16:21:22.684855: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-05-06 16:21:23.151861: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-05-06 16:21:23.151903: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 
2018-05-06 16:21:23.151916: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N 
2018-05-06 16:21:23.152061: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1692 MB memory) -> physical GPU (device: 0, name: Quadro K620, pci bus id: 0000:01:00.0, compute capability: 5.0)

 8192 x 8192 matmul took: 1.34 sec, 817.99 G ops/sec


Compiling and installing GROMACS 2016 by using Intel C/C++ and Fortran compiler, and adding CUDA support


Content:

1. Introduction.

2. Setting the building environment.

3. Short notes on AVX-capable CPUs support available in GROMACS.

4. Downloading and installing CUDA.

5. Compiling and installing OpenMPI.

6. Compiling and installing GROMACS.

7. Invoking GROMACS.


GROMACS is open source software for performing molecular dynamics simulations. It also provides an excellent set of tools for analyzing the results of the simulations. GROMACS is fast and robust, and its code supports a wide range of run-time and compile-time optimizations. This document explains how to compile GROMACS 2016 with CUDA support on CentOS 7 and Scientific Linux 7, by means of the Intel C/C++ and Fortran compiler.

Before starting with the compilation, be sure that you are aware of the following:

  • Do not use the latest version of GROMACS for production right after its official release, unless you are a developer or just want to see what is new. Every software product based on such a huge amount of source code might contain critical bugs at the beginning. Wait 1-2 weeks after the release date and then carefully check the GROMACS user-support forums. If no critical bugs are reported there (nor minor bugs which might affect your simulations in particular), you can compile the latest release of GROMACS. Even then, test the build against some known simulation results of yours. If you see no big differences (or only expected ones), you can proceed with deploying the latest GROMACS release on your system for production.

  • If you administer an HPC facility where the compute nodes are equipped with different processors, you most probably need to compile the GROMACS code separately to match the features of each CPU type. To do so, create a build host for each CPU type by using nodes matching that type. Compile GROMACS there and then clone the installation to the rest of the nodes of the same CPU type.

  • Always use the latest CUDA compatible with the particular GROMACS release (carefully check the GROMACS GPU documentation).

  • During the compilation of GROMACS, always build its own FFTW library. That really boosts the performance of GROMACS.

  • Compiling OpenMPI with the Intel Compiler is not of critical importance (the system OpenMPI libraries provided by the Linux distributions could be employed instead), but it might improve the performance of the simulations. The Intel C/C++ and Fortran compiler also provides its native MPI support that could be used later, but having a freshly compiled OpenMPI helps you stay up to date with recent MPI development. Before starting the compilation, be absolutely sure what libraries and compiler options you need to successfully compile your custom OpenMPI and GROMACS!

  • Use the latest Intel C/C++ and Fortran Compiler if possible. That largely guarantees that the specific processor features of the GPU and CPU will be taken into account by the C/C++ and Fortran compilers during the compilation process.

 

2. Setting the building environment.

Before starting be sure you have your build folder created. You might need an unprivileged user to perform the compilation. Open this document to see how to do that:

https://vessokolev.blogspot.com/2016/08/speeding-up-your-scientific-python-code.html

See paragraphs 2, 3, and 4 there.

 

3. Short notes on AVX-capable CPUs support available in GROMACS.

If the output of cat /proc/cpuinfo shows the avx2 CPU flag (see the flags line below):

$ cat /proc/cpuinfo
...
processor : 23
vendor_id : GenuineIntel
cpu family : 6
model : 63
model name : Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz
stepping : 2
microcode : 0x37
cpu MHz : 1221.156
cache size : 30720 KB
physical id : 0
siblings : 24
core id : 13
cpu cores : 12
apicid : 27
initial apicid : 27
fpu : yes
fpu_exception : yes
cpuid level : 15
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc
bogomips : 4589.54
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

then your CPU supports Intel® Advanced Vector Extensions 2 (Intel® AVX2). GROMACS supports AVX2, and that feature significantly boosts the performance of the simulations when computing bonded interactions. More on that CPU architecture feature here:

https://software.intel.com/en-us/articles/how-intel-avx2-improves-performance-on-server-applications
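The flag check above can be wrapped in a tiny helper so it is easy to run on each node; a sketch (the function name is made up here):

```shell
# Return success if the given CPU flag (e.g. avx2) appears in the flags
# line of a cpuinfo file ($2, defaulting to /proc/cpuinfo)
has_cpu_flag() {
    grep -qm1 "^flags.*\b$1\b" "${2:-/proc/cpuinfo}"
}
```

For example: `has_cpu_flag avx2 && echo "AVX2 available"`.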

 

4. Downloading and installing CUDA.

You need to install the CUDA rpm packages on both the build host and the compute nodes. The easiest and most efficient way to do so, and to get updates later (when any are available), is through yum. To install the NVidia CUDA yum repository file, visit:

https://developer.nvidia.com/cuda-downloads

and download the repository rpm file, as illustrated in the picture shown below:

An alternative way to get the repository rpm file is to browse the NVidia CUDA repository directory at:

http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64

scroll down there, then find and download the rpm file named "cuda-repo-rhel7-*" (select the most recent one). Then install it locally by using yum localinstall:

# yum localinstall /path/to/cuda-repo-rhel7-*.rpm

Once ready with the CUDA repository installation, become root or super user and install the CUDA Toolkit rpm packages:

# yum install cuda

Note that the installation process takes time, which mainly depends on both the network connectivity and the performance of the local system. Also note that yum automatically installs (through dependencies in the rpm packages) DKMS to support rebuilding the NVidia kernel modules when booting a new kernel.

If you do not expect to use your compute nodes for compiling any code with CUDA support and will only execute compiled binary code there, you might not need to install all rpm packages through the meta package "cuda" (as shown above). You could specify which of the packages you really need to install there (see the repository). To preview all packages provided by the NVidia CUDA repository, execute:

# yum --disablerepo="*" --enablerepo="cuda" list available

An alternative way to preview all packages available in the repository "cuda" is to use the locally cached sqlite3 database of that repository:

# yum makecache
# HASH=`ls -p /var/cache/yum/x86_64/7/cuda/ | grep -v '/$' | grep primary.sqlite.bz2 | awk -F "-" '{print $1}'`
# cp /var/cache/yum/x86_64/7/cuda/${HASH}-primary.sqlite.bz2 ~/tmp
# cd ~/tmp
# bunzip2 ${HASH}-primary.sqlite.bz2
# sqlite3 ${HASH}-primary.sqlite
sqlite> select name,version,arch,summary from packages;

 

5. Compiling and installing OpenMPI

Be sure you have the building environment set as explained before. Then install the packages hwloc-devel and valgrind-devel:

# yum install hwloc-devel valgrind-devel

and finally proceed with the configuration, compilation, and installation:

$ cd /home/builder/compile
$ . ~/.intel_env
$ . /usr/local/appstack/.appstack_env
$ wget https://www.open-mpi.org/software/ompi/v2.0/downloads/openmpi-2.0.0.tar.bz2
$ tar jxvf openmpi-2.0.0.tar.bz2
$ cd openmpi-2.0.0
$ ./configure --prefix=/usr/local/appstack/openmpi-2.0.0 --enable-ipv6 --enable-mpi-fortran --enable-mpi-cxx --with-cuda --with-hwloc
$ gmake
$ gmake install
$ ln -s /usr/local/appstack/openmpi-2.0.0 /usr/local/appstack/openmpi
$ export PATH=/usr/local/appstack/openmpi/bin:$PATH
$ export LD_LIBRARY_PATH=/usr/local/appstack/openmpi/lib:$LD_LIBRARY_PATH

Do not forget to update the variables PATH and LD_LIBRARY_PATH by editing their values in the file /usr/local/appstack/.appstack_env. The OpenMPI installation thus compiled and installed provides your applications with more recent MPI tools and libraries than might be provided by the Intel C/C++/Fortran Compiler package.
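As a sketch of what /usr/local/appstack/.appstack_env might contain after this step (the exact contents are site-specific, so treat these lines as an assumption):

```shell
# Hypothetical .appstack_env after installing OpenMPI under /usr/local/appstack
export PATH=/usr/local/appstack/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/appstack/openmpi/lib:$LD_LIBRARY_PATH
```

Sourcing this file in a shell (or from .bashrc) makes the custom OpenMPI tools and libraries visible to that session.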

 

6. Compiling and installing GROMACS

Be sure you have the building environment set as explained before and OpenMPI installed as shown above. Then proceed with GROMACS compilation and installation:

$ cd /home/builder/compile
$ wget ftp://ftp.gromacs.org/pub/gromacs/gromacs-2016.tar.gz
$ tar zxvf gromacs-2016.tar.gz
$ cd gromacs-2016
$ . ~/.intel_env
$ . /usr/local/appstack/.appstack_env
$ cmake . -DCMAKE_INSTALL_PREFIX=/usr/local/appstack/gromacs-2016 -DGMX_MPI=ON -DGMX_BUILD_OWN_FFTW=ON -DGMX_GPU=ON -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda -DMPI_C_LIBRARIES=/usr/local/appstack/openmpi/lib/libmpi.so -DMPI_C_INCLUDE_PATH=/usr/local/appstack/openmpi/include -DMPI_CXX_LIBRARIES=/usr/local/appstack/openmpi/lib/libmpi.so -DMPI_CXX_INCLUDE_PATH=/usr/local/appstack/openmpi/include
$ gmake
$ gmake install
$ export PATH=/usr/local/appstack/gromacs-2016/bin:$PATH
$ export LD_LIBRARY_PATH=/usr/local/appstack/gromacs-2016/lib64:$LD_LIBRARY_PATH

Do not forget to update the variables PATH and LD_LIBRARY_PATH by editing their values in the file /usr/local/appstack/.appstack_env.

 

7. Invoking GROMACS

To invoke the GROMACS compiled and installed by following the instructions in this document, you need to have the executable gmx_mpi (not gmx!) reachable through your PATH environment variable, as well as the path to libgromacs_mpi.so in LD_LIBRARY_PATH. You may set those paths by appending to your .bashrc the line:

. /usr/local/appstack/.appstack_env

If you prefer not to write this down to .bashrc, you may instead execute the line above manually whenever you need to invoke gmx_mpi.

