32 H&PC Today - Household and Personal Care Today - vol. 13(3) May/June 2018

Table 1a reports the most studied targets of active natural compounds, ranked by the number of activity values irrespective of species. The cytochrome P450 family was the most studied, in particular isoforms 3A4 and 1A2. Besides the cytochrome P450 members, acetylcholinesterase, alpha-glucosidase and tyrosinase also had a large number of active NCs: 1760, 1676 and 1277, respectively.

Analysis of the medium-confidence subset

The analysis of the medium-confidence set showed 6575 unique natural compounds active against 3265 unique human targets. By molecular weight, the highest number of active natural compounds fell in the range from 150 to 500 Da.

Analysis of the high-confidence subset

From the three databases used in this study, a total of 4166 unique natural compounds with explicit IC50, EC50 and Ki values belong to the high-confidence set. These compounds were active against 1346 unique human targets. In detail, 3563, 1030 and 594 NCs were reported with explicit IC50, Ki and EC50 values, respectively. NCs with explicit IC50 values were therefore the most represented and were active against 1089 unique targets, while NCs with explicit Ki and EC50 values were reported to be active against fewer unique targets, 218 and 439, respectively. The distribution of active NCs by molecular weight showed the highest concentration of active molecules in the 200-300 Da range, irrespective of the potency measure considered. Among the NCs, the compounds active against the largest number of targets were quercetol, resveratrol and curcumin (Table 2). Quercetol was reported active against 216 targets in the IC50 set, and against 44 and 10 targets in the Ki and EC50 subsets, respectively.
Regarding the targets, proteins belonging to the cytochrome P450 family were the most frequent in the IC50 and Ki sets, while the two types of estrogen receptor (alpha and beta) were the most abundant targets with explicit EC50 values reported.

DISCUSSION

From the three data repositories used in this study, active natural compounds were extracted and classified into three confidence-level sets (high, medium and low). The low-confidence set comprises all active NCs, regardless of the target species studied, the assay and the potency measure involved. It represents the “universal” set containing all the available information on NCs and their activities. The analysis of the low-confidence set clearly shows that activity against human targets is the most studied, accounting for 33.8% of the total targets extracted, followed by two other mammalian species: Rattus norvegicus and Mus musculus. Interestingly, the active NCs extracted account for only 0.25% of all chemical structures described in the PubChem Compound database (one of the most complete repositories available), but in terms of active compounds tested the percentage increases to around 8.4% (PubChem Assay). So, even though the total number of NCs is far lower than the total number of chemical compounds described, the proportion with biological activity information is much higher. Moving from the low-confidence set to the medium-confidence set (NCs active against human targets, independently of the potency measure reported) and the high-confidence set (NCs active against human targets with explicit IC50, Ki and EC50 values reported), the total number of active NCs decreases from more than 200000 to 6575 for the medium-confidence set and 4162 for the high-confidence set.
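The three nested confidence levels described above can be illustrated with a minimal Python sketch. The record layout (`Activity`), the field names, the species label and the toy records are assumptions made purely for illustration; they are not the schema of the databases used in the study, nor the authors' actual extraction pipeline.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set

# Potency measures treated as "explicit" in the high-confidence set.
EXPLICIT_MEASURES = {"IC50", "Ki", "EC50"}
HUMAN = "Homo sapiens"  # assumed species label


@dataclass(frozen=True)
class Activity:
    """One hypothetical 'compound is active against target' record."""
    compound: str
    target: str
    species: str
    measure: Optional[str] = None  # e.g. "IC50"; None if not reported
    value: Optional[float] = None  # explicit potency value; None if absent


def confidence_sets(records: Iterable[Activity]) -> Dict[str, Set[str]]:
    """Return the unique compounds in each (nested) confidence set."""
    sets: Dict[str, Set[str]] = {"low": set(), "medium": set(), "high": set()}
    for rec in records:
        # Low: every active compound, whatever the species or measure.
        sets["low"].add(rec.compound)
        if rec.species != HUMAN:
            continue
        # Medium: active against a human target, any potency measure.
        sets["medium"].add(rec.compound)
        # High: human target plus an explicit IC50, Ki or EC50 value.
        if rec.measure in EXPLICIT_MEASURES and rec.value is not None:
            sets["high"].add(rec.compound)
    return sets


# Toy records with placeholder names (not data from the study).
records = [
    Activity("NC-A", "T1", HUMAN, "IC50", 120.0),
    Activity("NC-A", "T2", "Rattus norvegicus", "IC50", 50.0),
    Activity("NC-B", "T1", HUMAN),
    Activity("NC-C", "T3", "Mus musculus", "Ki", 10.0),
]
sets = confidence_sets(records)
for level in ("low", "medium", "high"):
    print(level, sorted(sets[level]))
```

Because each level only adds a filter to the previous one, every compound in the high-confidence set is also in the medium- and low-confidence sets, which is why the counts can only shrink from one level to the next.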
Compared with the medium-confidence set, the high-confidence set contains about 37% fewer active NCs, which indicates that the majority of the NCs tested against human targets are reported with explicit, high-confidence potency measurements. For all the sets presented in this work, the distribution of the active compounds by molecular weight showed a high concentration of active NCs in the range between 300 and 400 Da. A possible explanation for these findings is that small compounds and molecular fragments are easier to accommodate in differently shaped binding sites than larger ones, together with the higher probability of isolating low-MW compounds (less than 500 Da) from natural sources. With regard to the high-confidence set, and in particular to the subsets based on the potency measure, 3568 NCs belong to the IC50 and Ki subsets, while only 594 belong to the EC50 subset. Based on these data, we can state that the majority of NCs show inhibitory activity. In particular, proteins belonging to the cytochrome P450 family were the targets most frequently hit by NC inhibitors. On the other hand, both estrogen receptor classes (alpha and beta) were the most frequent targets of NCs with agonistic activity.

Table 1a. Target distribution of active NCs in the low-confidence level set.
Table 2. Most frequent multi-target NCs in the high-confidence level set (compound vs number of targets).
Table 1b. Most frequent multi-target NCs in the low-confidence level set.
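The percentages quoted in the Discussion can be rechecked with simple arithmetic. The sketch below uses only counts stated in the text; the helper and variable names are hypothetical. The inhibitory share assumes that the IC50/Ki group (3568) and the EC50 group (594) are counted without overlap, which the text does not state explicitly. Because the high-confidence total appears as both 4166 and 4162, both values are tried.

```python
def relative_reduction(larger: int, smaller: int) -> float:
    """Percentage by which `smaller` falls short of `larger`."""
    return 100.0 * (larger - smaller) / larger


def share(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


MEDIUM_SET = 6575               # unique NCs, medium-confidence set
HIGH_SET_TOTALS = (4166, 4162)  # both totals reported in the text

# Reduction from the medium- to the high-confidence set, for each total.
reductions = {
    high: relative_reduction(MEDIUM_SET, high) for high in HIGH_SET_TOTALS
}
for high, pct in reductions.items():
    print(f"medium -> high ({high}): {pct:.1f}% fewer active NCs")

IC50_OR_KI = 3568  # NCs in the IC50 and Ki subsets
EC50 = 594         # NCs in the EC50 subset

# Assumption: the two groups do not overlap, so their sum is the whole.
inhibitory_share = share(IC50_OR_KI, IC50_OR_KI + EC50)
print(f"share with IC50 or Ki values: {inhibitory_share:.1f}%")
```

Either total gives a reduction that rounds to 37%, in line with the figure quoted in the Discussion.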