Peak Picking in 4D Spectrum with POKY

Overview

This tutorial starts by referencing the HSQC spectra and converting all spectra to POKY/Sparky (UCSF) format. We then create 2D projections from the 4D spectrum, which help us reference the 4D spectrum properly.

For peak picking, we will follow a systematic strategy that, although more involved, will allow for higher precision in identifying peaks in the 4D spectrum while minimizing noise. Since the 2D projections are derived directly from the 4D spectrum, they provide a more accurate reference than the HSQC spectra, whose peak centers may deviate slightly from those in the projections.

To ensure accurate peak selection, we will use the 2D projections as intermediate reference points for restricted peak picking in the 4D spectrum. The workflow is as follows:

  1. Overlay the projections onto the corresponding HSQC spectra
  2. Identify peak centers by using the HSQC spectra as references
  3. Use these peak centers to perform restricted peak picking in the 4D spectrum
  4. Unfold or unalias peaks as necessary
  5. Remove noise peaks from the 4D spectrum

By following this approach, we ensure that the final set of picked peaks in the 4D spectrum is as accurate and noise-free as possible.

Prerequisites

Steps


Step 1. Reference the HSQC Spectra in Topspin

Follow the instructions to reference the 1H-15N and 1H-13C HSQC in Topspin with BioTop.


Step 2. Convert and Prepare Spectra

2.1 Convert Spectra to UCSF Format

Change into the directory where each spectrum is saved in Bruker format and run bruk2ucsf from there; running it from another directory will fail.
For example, to convert the 1H-15N HSQC, the 1H-13C HSQC, and the 4D HCNH NOESY:

bruk2ucsf_run 6/pdata/1/2rr /srv/NMR/Peak_Picking/Nanoluc/15N_HSQC.ucsf
bruk2ucsf_run 7/pdata/1/2rr /srv/NMR/Peak_Picking/Nanoluc/13C_HSQC.ucsf
bruk2ucsf_run 5/pdata/1/4rrr /srv/NMR/Peak_Picking/Nanoluc/4D_HCNH_NOESY.ucsf

Note: You can also convert the spectra from Bruker to UCSF format in POKY/Sparky, but you cannot rename the axes in that process.

2.2 Rename Axes

Rename the axes in the 1H-15N and 1H-13C HSQC spectra:

ucsfdata -a1 N -a2 HN 15N_HSQC.ucsf
ucsfdata -a1 C -a2 HC 13C_HSQC.ucsf

Print the axis values of the 4D HCNH NOESY:

ucsfdata 4D_HCNH_NOESY.ucsf

Example output:

axis                          w1          w2          w3          w4
nucleus                       1H         13C         15N          1H
matrix size                  256         256         256         416
block size                     8           8           8          13
upfield ppm                1.194       6.301     101.402       5.279
downfield ppm              8.208      73.001     133.002      10.622
spectrum width Hz       6666.667   15939.978    3043.445    5078.125
transmitter MHz          950.374     238.980      96.311     950.374

The upfield and downfield ppm rows show which 1H axis is HC and which is HN, because amide protons resonate at higher chemical shifts than aliphatic protons. In this example, w4 (5.279 to 10.622 ppm) is HN and w1 (1.194 to 8.208 ppm) is HC, so the following command names the axes correctly:

ucsfdata -a1 HC -a2 C -a3 N -a4 HN 4D_HCNH_NOESY.ucsf
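The reasoning above can be sketched in a few lines of Python. This is only an illustration and not part of POKY: the helper name `guess_proton_axes` is hypothetical, the table text is a shortened copy of the example output, and the rule is simply "the 1H axis with the higher downfield limit is HN".

```python
# Illustrative only: guess which 1H axis is amide (HN) and which is aliphatic (HC)
# from the table printed by ucsfdata. The helper name is hypothetical.

UCSFDATA_OUTPUT = """\
axis                          w1          w2          w3          w4
nucleus                       1H         13C         15N          1H
upfield ppm                1.194       6.301     101.402       5.279
downfield ppm              8.208      73.001     133.002      10.622
"""


def guess_proton_axes(text):
    """Return (hc_axis, hn_axis), for example ('w1', 'w4')."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue
        # The last four fields are per-axis values; the rest is the row label.
        rows[" ".join(fields[:-4])] = fields[-4:]

    axes = rows["axis"]
    nuclei = rows["nucleus"]
    downfield = [float(value) for value in rows["downfield ppm"]]

    protons = [i for i, nucleus in enumerate(nuclei) if nucleus == "1H"]
    if len(protons) != 2:
        raise ValueError("expected exactly two 1H axes")

    # Amide protons extend further downfield than aliphatic protons.
    hc, hn = sorted(protons, key=lambda i: downfield[i])
    return axes[hc], axes[hn]


hc_axis, hn_axis = guess_proton_axes(UCSFDATA_OUTPUT)
print(f"HC is {hc_axis}, HN is {hn_axis}")
```

Always confirm the result by eye, since an unusual spectral window could defeat such a simple rule.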

IMPORTANT: Make sure that axes are named consistently in all spectra; otherwise, you will encounter problems during peak picking.

2.3 Create C-HC and N-HN Projections

For a detailed tutorial, see Create_2D_projections_from_4D_spectrum. Briefly, extract the N-HN projection from the 4D HCNH NOESY. You may need to adjust the -p[1-4] values according to your 4D spectrum dimension order:

ucsfdata -p1 -r -o C-N-HN.ucsf 4D_HCNH_NOESY.ucsf
ucsfdata -p1 -r -o 2D_N-HN_proj.ucsf C-N-HN.ucsf

Similarly, for the C-HC projection:

ucsfdata -p4 -r -o HC-C-N.ucsf 4D_HCNH_NOESY.ucsf
ucsfdata -p3 -r -o 2D_HC-C_proj.ucsf HC-C-N.ucsf
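To work out which -p values you need for a different dimension order, it can help to track the axis names on paper or with a small script. The sketch below does only that bookkeeping and does not touch any spectrum file. It assumes that each -pN removes axis N (1-based) and that the remaining axes are renumbered, which is consistent with the commands above; the function name `remaining_axes` is hypothetical.

```python
# Illustrative bookkeeping only; no spectrum is read or written.
# Assumption: each -pN removes axis N (1-based) and the remaining axes are
# renumbered, which is consistent with the ucsfdata commands shown above.

def remaining_axes(axes, projected):
    """Return the axis names left after projecting out the given axes in order."""
    axes = list(axes)
    for n in projected:
        if not 1 <= n <= len(axes):
            raise ValueError(f"axis {n} does not exist in {axes}")
        del axes[n - 1]
    return axes


axes_4d = ["HC", "C", "N", "HN"]

# -p1 followed by -p1 leaves the N-HN plane.
n_hn = remaining_axes(axes_4d, [1, 1])

# -p4 followed by -p3 leaves the HC-C plane.
hc_c = remaining_axes(axes_4d, [4, 3])

print(n_hn, hc_c)
```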

Step 3. Loading the Spectra

Load the spectra


Step 4. Adjusting the Spectra

Synchronize Spectra

Correct the contour levels and colors

Align the 2D_N-HN_proj to the 15N_HSQC

Align the 2D_HC-C_proj to the 13C_HSQC
Follow the same procedure described in the previous step.

Reference the 4D_HCNH_NOESY


Step 5. Peak Picking

5.1 Adjusting Contour Levels and Preparing Reference Peaks

5.2 Optimizing Restricted Peak Picking for Higher Accuracy

For more accurate restricted peak picking, use the “Nucleate Grid” plugin in POKY. This tool helps generate artificial peaks in the form of a grid lattice, seeded from original peaks in your 2D reference spectra—in this context, from the 15N-HSQC and 13C-HSQC spectra.

How to install the “Nucleate grid” POKY plugin: copy the file nucleategrid.py to poky_linux/modules/poky. Then, in poky_linux/modules/poky/poky_site.py, add the following line directly below the entry that ends with `('restrictedpick','show_dialog')),`: `('Ng', 'Nucleate grid', ('nucleategrid', 'show_dialog')),`. Use straight quotes, not curly ones, or Python will reject the line.

The grid is generated starting from these reference (or “seed”) peaks and is expanded until an elliptical shape is formed, as defined by the parameters r1 and r2. You can also add spacing between grid points using the w1_step and w2_step padding parameters. To select the optimal r1 and r2 values for your spectra, focus on a few representative peaks and hit ms to measure their radius in both axes.
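To build intuition for how r1, r2, w1_step and w2_step interact, here is a minimal sketch of an elliptical lattice around one seed peak. It is not the plugin's actual code: the function name `elliptical_grid`, the example seed position and the parameter values are all assumptions chosen for illustration.

```python
# Minimal sketch of an elliptical lattice around one seed peak.
# Not the Nucleate grid plugin's code; names and numbers are illustrative.

def elliptical_grid(center_w1, center_w2, r1, r2, w1_step, w2_step):
    """Return lattice points (w1, w2) inside the ellipse with radii r1 and r2."""
    if min(r1, r2, w1_step, w2_step) <= 0:
        raise ValueError("radii and steps must be positive")

    eps = 1e-9  # tolerance for floating-point rounding at the ellipse border
    n1 = int(r1 / w1_step + eps)  # lattice steps that fit along w1
    n2 = int(r2 / w2_step + eps)  # lattice steps that fit along w2

    points = []
    for i in range(-n1, n1 + 1):
        for j in range(-n2, n2 + 1):
            d1 = i * w1_step
            d2 = j * w2_step
            # Keep the point only if it lies inside or on the ellipse.
            if (d1 / r1) ** 2 + (d2 / r2) ** 2 <= 1.0 + eps:
                points.append((center_w1 + d1, center_w2 + d2))
    return points


# Hypothetical 15N-HSQC seed at N = 120.0 ppm (w1), HN = 8.50 ppm (w2).
grid = elliptical_grid(120.0, 8.50, r1=0.6, r2=0.06, w1_step=0.2, w2_step=0.02)
print(len(grid), "grid points, including the seed position")
```

Smaller steps give a denser lattice and therefore more artificial peaks per seed, so the radii measured with ms and the step sizes together determine how many grid points you will later have to filter.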

[Figure: calculating w1_step and w2_step]

Once the artificial peaks are created, press lt to select and delete the low-intensity ones. This step refines the grid so that it covers only regions in the 2D spectra with high signal-to-noise ratios, as shown in the figures below.

[Figure: refined lattice grid on the 15N-HSQC]

[Figure: refined lattice grid on the 13C-HSQC]

You can then use all these peaks, both the real seed peaks and the newly generated artificial ones, for restricted peak picking. This captures more peaks in the target 4D NOESY spectrum. At this stage it is better to pick too many peaks than to miss real, informative inter-residue peaks that help build the NH-map: 4D-GraFID handles noise peaks automatically and later refines the selection using heuristic criteria.


5.3 Pick peaks in the 1H-13C HSQC

Repeat the same peak picking procedure for 1H-13C HSQC and create grid points for restricted peak picking.

5.4 Perform Restricted Peak Picking using 15N HSQC and 1H-13C HSQC as reference

Since this protein is large, we will perform restricted peak picking in two rounds.
This is necessary because the screen updates every time a picking cycle completes, and for large proteins, this eventually becomes terribly slow.

⚠️ Important: Unlike in the first round, do not delete any peaks yet; otherwise, you will lose some of the peaks identified earlier.

💡 For large proteins with tens of thousands of peaks, it is recommended to delete them in two batches rather than all at once.

This is how the final peak selection on the HC-C plane of the 4D NOESY (overlaid on the 13C-HSQC for clarity) should look.

[Figure: 4D peaks on the HC-C plane]

Step 6. Unalias/Unfold 4D Peaks (if necessary)

Next, we will unalias/unfold the peaks, if any need it. For more details, please read the respective article.

Aliased peaks usually occur in the ranges C < 25 ppm and HC > 3 ppm.
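The arithmetic behind unaliasing can be sketched as follows. This is a simplified illustration, assuming plain aliasing in which a peak is displaced by a whole number of spectral widths (mirror-type folding behaves differently). The function names are hypothetical, the 13C numbers come from the example ucsfdata output above, and the observed shift of 20.0 ppm is made up.

```python
# Simplified illustration of unaliasing arithmetic (plain aliasing only).
# Function names are hypothetical; 13C values are from the example output above.

def sweep_width_ppm(spectrum_width_hz, transmitter_mhz):
    """Spectral width in ppm = width in Hz / transmitter frequency in MHz."""
    return spectrum_width_hz / transmitter_mhz


def unalias_candidates(observed_ppm, sw_ppm, max_shifts=1):
    """Possible true positions: observed shift plus n spectral widths."""
    return [observed_ppm + n * sw_ppm for n in range(-max_shifts, max_shifts + 1)]


# 13C axis of the example 4D: 15939.978 Hz at 238.980 MHz.
sw_c = sweep_width_ppm(15939.978, 238.980)
print(f"13C spectral width: {sw_c:.2f} ppm")

# A made-up peak observed at 20.0 ppm could truly sit at any of these positions.
candidates = unalias_candidates(20.0, sw_c)
print([round(c, 2) for c in candidates])
```

Which candidate is the real position has to be decided from chemical knowledge, for example the expected 13C shift range of the attached proton type.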

For demonstration, the figure below shows the 4D spectrum of another protein, which has many aliased peaks that appear at the top.

[Figure: aliased peaks in a 4D spectrum]


Step 7. Manual Refinement of 4D Peak List

Discard the 4D noise peaks using an S/N cutoff

[Figure: low-intensity 4D peaks]
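The idea of an S/N cutoff can be sketched in Python. This is not how POKY implements it: the function name `split_by_snr`, the tuple layout (w1, w2, w3, w4, height), the noise level and the cutoff are all placeholder assumptions that you would replace with values appropriate for your own spectrum.

```python
# Sketch of an S/N cutoff on a 4D peak list. Not POKY's implementation;
# the tuple layout, noise level and cutoff are placeholder assumptions.

def split_by_snr(peaks, noise_level, cutoff):
    """Split peaks into (kept, discarded) by |height| / noise_level >= cutoff."""
    if noise_level <= 0:
        raise ValueError("noise_level must be positive")
    kept, discarded = [], []
    for peak in peaks:
        height = peak[-1]
        # The absolute value keeps negative peaks from being dropped by sign alone.
        snr = abs(height) / noise_level
        (kept if snr >= cutoff else discarded).append(peak)
    return kept, discarded


# Made-up peaks: (HC ppm, C ppm, N ppm, HN ppm, data height).
peaks = [
    (1.25, 21.4, 118.3, 8.41, 9.0e5),
    (3.90, 55.0, 121.7, 7.95, 4.0e4),
    (0.85, 14.2, 125.0, 9.10, -6.5e5),
]

kept, discarded = split_by_snr(peaks, noise_level=2.0e4, cutoff=5.0)
print(len(kept), "kept,", len(discarded), "discarded")
```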

Discard the noise peaks using the 13C-HSQC and the 15N-HSQC

Next, we will manually inspect all the 4D peaks and remove those that fall in density regions of neither the 13C-HSQC nor the 15N-HSQC.

You should now have two different views of the 4D spectrum:

  1. One showing the selected peaks on the HC-C plane
  2. The other showing the selected peaks on the N-HN plane

Manually inspect the peaks in both views and delete those not in density regions. This requires some intuition and a sharp eye. Unfortunately, it cannot be automated; it must be done under manual supervision by pressing st. Two obvious noise peaks on the HC-C plane view are shown below.

[Figure: manually selected noise peaks]


Step 8. Exporting Peak Lists for 4D-GraFID

8.1 Export Picked 4D Peaks

Go to the 4D peak list (type lt) and select the columns w1, w2, w3, w4, Data Height. Click Apply, then Save….

8.2 Enhance the 13C-HSQC peak list with multiplicity information (which C–H is methylene)

We follow this approach because the 13C ME-HSQC spectrum is very noisy, with large dispersion effects, so its peak centers deviate from those identified in the standard 13C-HSQC spectrum, which provides the maximum possible resolution and S/N. As a result, we end up with a near-complete aliphatic C-H peak list that records whether each peak corresponds to a methylene group, which improves both the accuracy and the coverage of chemical shift assignment in 4D-GraFID.

8.3 Enhance the 15N-HSQC peak list with multiplicity information (which peak comes from N–H and which from N–H2)

Apply the same approach to the standard 15N-HSQC peak list, using the 15N ME-HSQC spectrum. Export the new peak list to a file that includes the intensity column of the 15N ME-HSQC, which indicates which peaks originate from an N-H group and which from an N-H2 group. 4D-GraFID uses this information to identify the side-chain amide peaks.
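As a rough illustration of how the ME-HSQC intensity column can be read, the sketch below labels each peak from the sign of its intensity. It assumes that N-H and N-H2 peaks have opposite signs in the multiplicity-edited spectrum; which sign corresponds to which group depends on how your spectrum was acquired and phased, so it is passed in as a parameter. The function name `label_multiplicity` and the peak values are hypothetical, and this is not the parsing logic of 4D-GraFID.

```python
# Rough illustration: label peaks by the sign of the ME-HSQC intensity.
# Assumption: N-H and N-H2 peaks have opposite signs; which sign means N-H2
# depends on acquisition and phasing, so the caller must supply it.

def label_multiplicity(peaks, nh2_sign):
    """peaks: iterable of (n_ppm, hn_ppm, me_intensity); nh2_sign: +1 or -1."""
    if nh2_sign not in (1, -1):
        raise ValueError("nh2_sign must be +1 or -1")
    labelled = []
    for n_ppm, hn_ppm, me_intensity in peaks:
        if me_intensity == 0:
            group = "unknown"
        elif (me_intensity > 0) == (nh2_sign > 0):
            group = "NH2"
        else:
            group = "NH"
        labelled.append((n_ppm, hn_ppm, group))
    return labelled


# Made-up 15N-HSQC peaks with their 15N ME-HSQC intensities.
example_peaks = [
    (118.3, 8.41, 7.2e5),
    (112.6, 7.55, -3.1e5),
    (112.6, 6.88, -2.9e5),
]

# Here negative intensity is assumed to mark N-H2; verify this on your data.
labels = label_multiplicity(example_peaks, nh2_sign=-1)
print(labels)
```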


Notes for Special Cases

Unaliasing Peaks in POKY
When you perform restricted peak picking (kr) with reference peaks that have not been unaliased or unfolded, POKY automatically checks for possible aliased peaks. If the spectrum width of the reference 2D is larger than that of the target 3D/4D spectrum, POKY will find and mark the peaks in the 3D/4D as aliased.
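To see why a wider reference spectrum leads to aliased positions in a narrower target, consider where a reference shift lands once it is wrapped into the target window. The sketch below shows only this arithmetic for plain aliasing; it is not POKY's internal matching code. The function name `alias_into_window` is hypothetical, the window limits are the 13C values from the example ucsfdata output, and the 130.0 ppm reference shift is made up.

```python
# Arithmetic sketch: wrap a reference shift into a narrower target window.
# Plain aliasing only; not POKY's internal matching code.
import math


def alias_into_window(ppm, upfield, downfield):
    """Return (apparent_ppm, n), where n spectral widths were subtracted."""
    sw = downfield - upfield
    if sw <= 0:
        raise ValueError("downfield must be greater than upfield")
    n = math.floor((ppm - upfield) / sw)
    return ppm - n * sw, n


# 13C window of the example 4D: 6.301 to 73.001 ppm.
apparent, n = alias_into_window(130.0, 6.301, 73.001)
print(f"appears at {apparent:.3f} ppm after subtracting {n} spectral width(s)")

# A shift already inside the window is returned unchanged with n = 0.
inside, n_inside = alias_into_window(40.0, 6.301, 73.001)
```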

However, beware: when your reference peaks are unaliased or unfolded, POKY will not match the correct peaks in the target spectrum unless those are also unaliased/unfolded. It may pick up some peaks, but they will be irrelevant. Therefore, do not unalias/unfold the peaks in the 2D HC-C and N-HN projections! Do the unaliasing/unfolding directly on the 4D HCNH NOESY.

Below are examples of the 13C-HSQC spectra with aliased peaks (in yellow boxes):

Protein 1: [Figure: 13C-HSQC-ac1]
Protein 2: [Figure: 13C-HSQC-sy15]

Authors