Original Article
Published: 2022-05-20

Profiling exhaled breath of smokers using mass spectrometry to identify a signature related to tobacco use for diagnostic perspectives

S.S.D. Pneumologia, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano
S.S.D. Bersagli Molecolari, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano
S.S.D. Bersagli Molecolari, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano
S.S.D. Pneumologia, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano
S.S.D. Bersagli Molecolari, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano
Keywords: tobacco smoke; COPD; mass spectrometry; breath analysis


Breath analysis for the identification of volatile organic compounds (VOCs) by mass spectrometry is an innovative, non-invasive technology that offers a great opportunity for early and personalised diagnosis. In this pilot study we recruited volunteers, smokers and non-smokers, characterized their respiratory function, and profiled their exhaled breath using SESI-HRMS technology. The aim of the study was to identify a volatile molecular signature associated with tobacco use. The supervised analysis highlighted 32 features that discriminate the breath of smokers from that of non-smokers at baseline. We therefore identified a volatile molecular signature closely related to tobacco smoke, which will be characterized in subsequent studies.


Cigarette smoking is the main risk factor for the onset and exacerbation of inflammatory diseases of the respiratory system, such as chronic obstructive pulmonary disease (COPD) and asthma, and of oncological diseases, such as lung tumours. Tobacco consumption generates the characteristic inflammatory state in two ways: directly, through the oxidizing agents it contains, such as nitric oxide (NO) and free radicals, and indirectly, through the stimulation of inflammatory cells which, once activated, produce reactive oxygen species (ROS) that contribute to inflammation and tissue damage. The main forms of damage caused by smoking include injury to the respiratory epithelium, inactivation of antiproteases, mucus hypersecretion, increased sequestration of neutrophils in the pulmonary microcirculation, and increased gene expression of pro-inflammatory mediators. The compounds present in tobacco smoke cause intense bronchial inflammation, and persistence of the smoking habit leads to cumulative damage that facilitates airway remodelling, as in COPD and asthma [1]. To date, the diagnosis of these two chronic inflammatory diseases is supported by spirometry (an examination that evaluates lung function) in a patient whose history is consistent with the suspected diagnosis. However, spirometry confirms the presence of bronchial obstruction and defines its degree, but it does not provide information on the degree of inflammation, especially in the early stages of the disease.
Breath analysis for volatile organic compounds (VOCs) is an innovative technology that offers a great opportunity to redesign clinical diagnostics towards early and personalized diagnosis. It is a totally non-invasive procedure that allows unlimited sampling, is reproducible and easy to perform, and is well accepted even by fragile patients with breathing difficulties, which makes it especially suited to respiratory diseases such as asthma, COPD and lung cancer [2]. The concentrations of VOCs detectable in human exhaled breath can be altered by inflammatory processes, in particular through the oxidation of lipids and the action of radical species [3]. In recent years, preliminary studies have analyzed the exhaled breath of COPD patients to assess the level of inflammation and the risk of exacerbation [4, 5]. Fens et al. [6] also suggested an association between some exhaled VOCs typically involved in lipid peroxidation and the number of eosinophils and neutrophils, which promote the formation of ROS.

Objective of the study

In this pilot study we used a very sensitive analytical technology based on mass spectrometry (MS) and designed for clinical practice to analyze the breath of healthy volunteers. We recruited volunteers, smokers and non-smokers, who were well characterized in terms of respiratory function, exhaled carbon monoxide (CO), fractional exhaled nitric oxide (FeNO) and smoking habits. Their exhaled breath was analyzed by mass spectrometry in order to:

  1. identify the differences in terms of volatile organic compounds (VOCs) present in the exhalation of healthy smoking volunteers compared to the breath of healthy non-smoking subjects;
  2. identify and characterize a volatile molecular signature associated with chronic tobacco use;
  3. identify and characterize a volatile molecular signature relating to the breath collected immediately after the consumption of a cigarette.


Study design

The study was carried out at the IRCCS National Cancer Institute (INT) Foundation in Milan and approved by the Ethics Committee (INT 16/17 amendment 4). It involved the recruitment of 45 healthy male volunteers, 22 smokers and 23 non-smokers, from December 2020 to March 2021. Only men were recruited, to avoid introducing sex-related confounding factors. Furthermore, to ensure that lifestyle did not introduce substances capable of interfering with the instrumental analysis, in the 2 hours before the test participants were asked not to:

  1. smoke;
  2. eat food or candy;
  3. drink anything other than water;
  4. use cocoa butter;
  5. take medications (if medication was necessary, they were to take it and inform the staff).

In the first phase, the study participants completed the informed consent form, a questionnaire ascertaining the absence of symptoms attributable to COVID-19, and a medical history and lifestyle questionnaire. They then underwent spirometry, FeNO measurement to assess bronchial inflammation, and CO measurement to quantify exposure to tobacco smoke. All smokers were offered access to our anti-smoking centre for smoking cessation.

In the second phase, the participants were taken to a room dedicated to breath sampling and equipped with an air purification system. Breath samples were collected by inflating 2-liter Nalophan balloons, produced in-house and sterilized with hydrogen peroxide vapour, with a single deep exhalation. At the same time, room air was also collected to assess the presence of environmental interferents. Participants were asked to fill two balloons with their exhaled breath. After a 20-minute break, during which smokers were asked to smoke a single cigarette (the only procedural difference between the two groups), breath was collected again by filling another 2 balloons, for a total of 4 balloons per volunteer.

Breath collection followed the most up-to-date provisions to limit the spread of COVID-19: in particular, the balloons were fitted with a mouthpiece with a filter capable of retaining bacteria and viruses while still allowing VOCs to pass without altering their composition. The operators also used all the personal protective equipment (PPE) prescribed in a hospital environment for the containment of the pandemic. The samples were taken to the laboratory and analyzed within 120 minutes of collection to avoid deterioration. Sampling and subsequent instrumental analysis followed standard operating procedures (SOPs) previously validated in other INT studies. Any deviation from the SOPs was recorded; the non-compliant samples, from 2 subjects, were collected and analyzed again.

Analysis of exhaled samples

The exhaled breath was analyzed with secondary electrospray ionization-high resolution mass spectrometry (SESI-HRMS) to detect the volatile organic compounds present in human breath, essentially as described by Martinez-Lozano Sinues et al. [7]. Spectral data were extracted and converted to numerical data with the MZmine software [8], and the subsequent statistical analysis was performed with our pre-processing and supervised analysis procedure using the R software [7]. The final result is a signature, that is, a list of features differentially present in the groups analyzed, attributable to peaks detected by MS and therefore to the elemental compositions of the VOCs and their relative intensity values.
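To make the idea of a signature concrete, the following is a minimal sketch in Python of how features could be ranked by how strongly they separate two groups. It is not the procedure used in the study, which was run in R and is not detailed here: Welch's t statistic is used purely as a simple stand-in, and the feature labels, intensity values and the `rank_features` helper are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_features(table, top_n):
    """Return the top_n feature names ordered by decreasing |t|.

    table maps a feature name to (smoker intensities, non-smoker intensities).
    """
    scores = {name: welch_t(s, n) for name, (s, n) in table.items()}
    return sorted(scores, key=lambda name: abs(scores[name]), reverse=True)[:top_n]


# Invented intensities for three illustrative features (not study data).
toy = {
    "mz_101.0": ([5.1, 5.3, 5.2, 5.4], [5.2, 5.1, 5.3, 5.3]),
    "mz_135.1": ([9.0, 9.4, 9.2, 9.1], [6.1, 6.3, 6.0, 6.2]),
    "mz_212.2": ([4.0, 4.6, 4.2, 4.4], [3.8, 4.1, 4.3, 3.9]),
}
signature = rank_features(toy, top_n=2)
print(signature)  # ['mz_135.1', 'mz_212.2']
```

In a real analysis with thousands of features, a ranking like this would need to be paired with a correction for multiple testing, which the sketch omits.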


Sample description

We enrolled 45 healthy male volunteers aged 24 to 67: 23 non-smokers and 22 smokers. The mean age was 42 years in both groups. The smoking history of the 22 smokers gave the following mean values: 13 cigarettes/day, 23 years of smoking, and 16 pack-years (P/Y).
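As a reminder of how pack-years are computed, the sketch below uses the standard definition (packs of 20 cigarettes smoked per day multiplied by years of smoking). It also illustrates why a cohort mean of 16 P/Y need not equal the value obtained from the mean cigarettes/day and mean years: the mean of per-subject products generally differs from the product of the means. The two example smokers are hypothetical, not study participants.

```python
def pack_years(cigarettes_per_day, years_smoked):
    """Pack-years: packs smoked per day (20 cigarettes per pack) times years of smoking."""
    return cigarettes_per_day / 20 * years_smoked


# Pack-years computed from the cohort averages (13 cig/day, 23 years).
from_means = pack_years(13, 23)
print(round(from_means, 2))  # 14.95

# Two hypothetical smokers (invented values, not study data).
smokers = [(10, 10), (20, 30)]  # (cigarettes/day, years smoked)
per_subject = [pack_years(c, y) for c, y in smokers]
mean_of_subjects = sum(per_subject) / len(per_subject)
print(per_subject, mean_of_subjects)  # [5.0, 30.0] 17.5

# The same two smokers average 15 cig/day and 20 years, which gives a lower value.
print(pack_years(15, 20))  # 15.0
```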

The analysis of CO, FeNO and spirometry (Figure 1) shows:

  1. that the CO levels are valid and confirm the smoking status declared by the volunteers (Fig. 1 a);
  2. a clear relationship between daily cigarette consumption and CO values. In 4 subjects we found altered spirometry (obstructive syndrome), and they were referred to the pulmonologist for further investigation. All four altered spirometries (value 1) were found in smokers of ≥ 20 cigarettes/day (Fig. 1 b);
  3. that FeNO values, compared between smokers and non-smokers, confirm that this inflammatory indicator is lower in smokers. This finding is already reported in the literature, although the mechanism is still not fully understood [9] (Fig. 1 c);
  4. no correlation between the number of cigarettes/day and FeNO values (Fig. 1 d).

Breath analysis

The breath samples collected both at baseline and after 20 minutes (4 samples for each subject) generated a dataset of 19,896 features for 180 samples, which underwent quality control, normalization and data filtering in the pre-processing phase. The final dataset contains 2,332 features for 168 samples. The supervised analysis highlighted 32 and 184 features that discriminate the breath of the 2 groups at baseline and at the second sampling, respectively. The five most significant features at baseline have yet to be characterized. It should be noted that the baseline differences were found between non-smokers and smokers who had abstained from smoking for at least two hours. These features could therefore be traced back to chronic tobacco use. On the other hand, some features characteristic of smokers emerge only from the analysis of the breath collected immediately after tobacco consumption. We are therefore also able to detect signals deriving from the direct and immediate use of a cigarette.
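The dataset sizes quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures given in the text; the variable names are illustrative.

```python
subjects = 45
balloons_per_subject = 4  # two at baseline, two after the 20-minute break

raw_samples = subjects * balloons_per_subject
kept_samples = 168
raw_features, kept_features = 19_896, 2_332

print(raw_samples)  # 180
print(raw_samples - kept_samples)  # 12 samples removed in pre-processing
print(round(100 * kept_features / raw_features, 1))  # 11.7 (% of features retained)
```

So pre-processing removed 12 of the 180 samples and kept roughly one feature in nine.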

Conclusions and future clinical implications

In the first phase of the present study we confirmed the expected results for CO (correlation with the number of cigarettes consumed) and FeNO (lower in smokers). In the second phase, by comparing the breath prints of smokers and non-smokers obtained through breath analysis, we detected molecular differences and identified features significantly related to chronic tobacco use and to direct and immediate exposure to cigarette smoke. We can therefore say that we have identified a molecular signature closely related to tobacco smoke, which will have to be characterized in subsequent studies.

This pilot study opens up several study-design scenarios with important clinical implications:

  1. follow-up and sampling of the 22 smokers, whether they have quit smoking (to date, 2 subjects have remained abstinent after being referred to our Anti-Smoking Center) or have continued to smoke, to evaluate the behaviour of the features considered characteristic;
  2. validation of the signature in a study with a larger sample size, to characterize the VOCs present in exhaled breath and place them in the context of human metabolic pathways related to tobacco consumption, in order to better understand the inflammatory and biochemical mechanisms underlying smoke damage, which are not yet fully clarified;
  3. larger studies including smokers with inflammatory diseases at different stages, made possible by the existence of a smoking-associated profile, to associate this profile with smoke-related inflammatory diseases with a view to an early, reproducible, sustainable and completely non-invasive diagnosis;
  4. identification of smoke-related confounders in clinical studies searching for cancer-associated biomarkers in breath, using the information obtained from this study.

Figures and tables

Figure 1. Measured values of CO and FeNO and the outcome of the subjects' spirometry.


  1. Melillo E, Melillo G. Fumo di tabacco e stress ossidativo respiratorio. Tabaccologia. 2004; II(1):15-9.
  2. Ibrahim W, Carr L, Cordell R, Wilde MJ, Salman D, Monks PS. Breathomics for the clinician: the use of volatile organic compounds in respiratory diseases. Thorax. 2021; 76:514-21.
  3. Arterbery VE, Pryor WA, Sehnert SS, Foster WM, Abrams RA, Williams JR. Breath ethane generation during clinical total body irradiation as a marker of oxygen-free-radical-mediated lipid peroxidation: a case study. Free Radic Biol Med. 1994; 17:569-76.
  4. Martinez-Lozano Sinues P, Meier L, Berchtold C, Ivanov M, Sievi N, Camen G. Breath analysis in real time by mass spectrometry in chronic obstructive pulmonary disease. Respiration. 2014; 87:301-10.
  5. Gaugg MT, Nussbaumer-Ochsner Y, Bregy L, Engler A, Stebler N, Gaisl T. Real-time breath analysis reveals specific metabolic signatures of COPD exacerbations. Chest. 2019; 156:269-76.
  6. Fens N, de Nijs SB, Peters S, Dekker T, Knobel HH, Vink TJ. Exhaled air molecular profiling in relation to inflammatory subtype and activity in COPD. Eur Respir J. 2011; 38:1301-9.
  7. Martinez-Lozano Sinues P, Landoni E, Miceli R, Dibari VF, Dugo M, Agresti R. Secondary electrospray ionization-mass spectrometry and a novel statistical bioinformatic approach identifies a cancer-related profile in exhaled breath of breast cancer patients: a pilot study. J Breath Res. 2015; 9:031001.
  8. Pluskal T, Castillo S, Villar-Briones A, Oresic M. MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data. BMC Bioinformatics. 2010; 11:395.
  9. Högman M, Janson C, Lisspers K, Bröms K, Ställberg B, Malinovschi A. Determinants of FENO in COPD with regard to current smoking. Eur Respir J. 2018; 52:PA2019.


Chiara Veronese

S.S.D. Pneumologia, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano

Francesco Segrado

S.S.D. Bersagli Molecolari, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano

Riccardo Caldarella

S.S.D. Bersagli Molecolari, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano

Roberto Boffi

S.S.D. Pneumologia, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano

Rosaria Orlandi

S.S.D. Bersagli Molecolari, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano


© Sintex Servizi S.r.l. , 2022
