Classifying cannabis types is a challenge that regulated markets around the world selling legal cannabis must resolve sooner rather than later. It behooves regulators in states like California, Massachusetts and Nevada, which mandate cannabis lab testing, to push for a standardized naming system that gives their consuming public complete transparency. In the following video, Dr. Philippe Henry, chief science officer at VSSL in Kelowna, B.C., Canada, who is working with Digipath Labs chief science officer Dr. Cindy Orser, discusses a common denominator between cannabis types. Terpenes such as beta-myrcene, terpinolene, limonene and beta-caryophyllene are found in abundance in both marijuana and hemp plants and offer some clues to how we can properly classify cannabis varieties.
Testing and classification of traditional agricultural commodities, such as cotton, use consistent methodologies that companies like LGC have implemented for a long time. It only makes sense to reference those methodologies when striving to develop a naming convention for cannabis. Dr. Henry begins by thanking LGC and Digipath Labs for helping with his research in Canada, where adult-use marijuana is legal. Just as cannabis lab testing is mandated in Nevada, it is also mandated in Canada. Dr. Henry humorously refers to the current breeding efforts of cannabis as being very creative.
Digipath Labs provided Dr. Philippe Henry with a great deal of the chemotyping data (chemotype: a chemically distinct entity in a plant, also called a chemovar, used to differentiate plant varieties) and phenotyping data (phenotyping: using genetic analysis to predict trait characteristics in a mature plant) he has used in his research on classifying cannabis. Dr. Henry gives a little history of cannabis and how it has typically been broken down into two different species. Although he views cannabis in a more dynamic fashion than most species concepts allow, he does view it as a single species.
“The gene pools of cannabis are much influenced by humanity,” Dr. Henry explains. He continues by explaining some of the differences among sativa, indica and ruderalis. “Essentially, we have four main gene pools… European hemp, narrow leaf hemp (NLH), broad leaf and narrow leaf drug types (BLD & NLD respectively) which are often referred to as indica and sativa.” The broad leaf and narrow leaf drug type plants are the commercial varieties that marijuana testing labs in Las Vegas typically test before sale. Both of these drug type plants are hybrids of one another.
Broad leaf hemp (BLH) or the Chinese hemp is not discussed as much. It is a non-psychoactive fiber type plant which Dr. Henry has never had a chance to run a sample on. All of these types of cannabis have been spread from Europe and Asia across the globe by humanity. What we refer to as marijuana is most likely a hybrid of what scientists refer to as the drug type varieties. Most recently, there has been a push for less psychoactive varieties that express CBD as opposed to THC. He explains that the CBD rich component of cannabis likely came from the European hemp varieties.
The spread of cannabis around the country by people has likely occurred due to certain characteristics of these cannabis types, such as fibrous material, hemp nut production or resin production. A high density of glandular trichomes signals the potential for large amounts of resin production; these trichomes produce the secondary metabolites responsible for the recreational and medicinal properties of cannabis. The trichomes found in drug type cannabis varieties are 1.6 times larger than those on the fibrous varieties of cannabis.
“The topic of my talk is about chemovars, which are a means to classify [cannabis] varieties by chemical expression,” Dr. Henry explains. “It is interesting that cannabis expresses cannabinoids as well as terpenoids. Cannabinoids have no scent, but often when we think of cannabis we have a particular smell in mind, which is most likely caused by the terpene expression of the plants. What this graph shows is work done with Digipath Labs, where right now the data set is at about 9,000 entries that are chemotyped at 11 cannabinoids and 19 terpenoids. What is very interesting is that because a very large focus has been placed on cannabinoid expression, it is not a valid means to classify varieties based on where they originated from, because it is that one trait that they were selected for, [not where they came from]. There were very strong breeding efforts for introgressing cannabinoids into different lines.
“We chose to get rid of cannabinoids even though they are shown here on this map. What you can see is the principal component analysis of about 5,000 data points, and what we were interested in seeing was how they clustered based on the chemotypic information. What we found is that there are three major groups, and these are drug type cannabis, so high resin producing medicinal commercial varieties. The first group is dominated by beta-myrcene, which is the most common terpene found in cannabis and is also the main constituent of the essential oils of fiber hemp. [The graph shows] another group which is dominated by terpinolene and a third group where the dominant terpenes are limonene and beta-caryophyllene.” The arrows on the graph show how all of these groups cluster and the different colors correspond to each of those three groups.
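To make the idea of a terpene-only principal component analysis concrete, here is a minimal sketch in plain Python. It is not Dr. Henry's analysis or data: the six samples and their terpene levels are made-up values chosen only to mimic the three groups described above, and the helper names (`center`, `covariance`, `first_component`) are hypothetical. It finds only the first principal component, using simple power iteration.

```python
import math

# Hypothetical terpene levels (percent) for six made-up samples.
# Columns: beta-myrcene, terpinolene, limonene, beta-caryophyllene.
TERPENES = ["beta-myrcene", "terpinolene", "limonene", "beta-caryophyllene"]
SAMPLES = [
    [1.20, 0.05, 0.10, 0.10],  # myrcene-dominant
    [1.00, 0.02, 0.15, 0.12],  # myrcene-dominant
    [0.20, 0.90, 0.10, 0.05],  # terpinolene-dominant
    [0.15, 1.10, 0.12, 0.06],  # terpinolene-dominant
    [0.20, 0.03, 0.60, 0.50],  # limonene/caryophyllene-dominant
    [0.25, 0.04, 0.70, 0.45],  # limonene/caryophyllene-dominant
]


def center(rows):
    """Subtract each column's mean so every terpene is centered on zero."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[value - mean for value, mean in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix between the columns of centered data."""
    cols = list(zip(*centered))
    n = len(centered)
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]


def first_component(cov, iterations=500):
    """Approximate the leading eigenvector of cov by power iteration."""
    vec = [1.0] + [0.0] * (len(cov) - 1)
    for _ in range(iterations):
        vec = [sum(c * v for c, v in zip(row, vec)) for row in cov]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return vec


centered = center(SAMPLES)
loadings = first_component(covariance(centered))
# Project every sample onto the first principal component.
scores = [sum(x * w for x, w in zip(row, loadings)) for row in centered]

for name, weight in zip(TERPENES, loadings):
    print(f"{name:20s} loading on PC1: {weight:+.2f}")
print("PC1 scores:", [round(s, 2) for s in scores])
```

With these toy values, the myrcene-dominant and terpinolene-dominant samples land on opposite sides of the first component. Separating the third group would need a second component, which a full PCA routine would provide.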
Dr. Henry continues, “This is where LGC comes in, in terms of moving from very large panels genotyping by sequencing. We took 70 accessions that were genotyped at about 180,000 single nucleotide polymorphisms (SNP). When those were filtered, we came out with a panel of about 1,409 high quality SNPs. The idea was to constrain the analysis. In this case here, a principal component analysis (PCA) that is constrained based on the chemotype profiles that I showed you earlier. Based on the different expression of beta-myrcene, limonene and terpinolene, they translate well based on these 1,400 SNPs. What you can do with this analysis is you can extract the loading on which particular SNP is driving the structure you are observing.
“We selected 48 of these 1,400 SNPs that were the most informative based on the statistics and we sent our data to LGC in Massachusetts, where they screened our population and additional individuals. This was based on the cannabis genome that was published in 2011. In November of last year, two new versions of the cannabis genome were published that were much more extensive, with linkage groups segregating to ten chromosomes and much higher coverage. In this case here, the linkage groups were something like 135,000 contigs, so we don’t have a complete picture of this, but in any case we decided to design KASP assays and see how they worked out. Nine out of the 48 failed to provide any amplification with this technique. The reason for that is the very scattered genome we used, which was seven years old at the time and has recently been replaced. We also found some discrepancies with the genotyping by sequencing approach: 16 SNPs that were shown to be polymorphic by genotyping by sequencing were actually monomorphic, even when we looked at European hemp versus medicinal cannabis.
“We came out with 23 polymorphic markers, so a little less than 50% of these markers came out as validated. Eighteen of them were associated with this terpene model, and an additional three were custom markers that we added in because we were interested in THC expression, plus one SNP for seed production and one SNP associated with the mitochondria, which we used to tease apart these large groups from each other,” Dr. Henry added.
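The marker counts quoted above can be checked with a little arithmetic. The short sketch below simply tallies the numbers from the talk; the variable names are hypothetical, and reading the final panel as 18 + 3 + 1 + 1 markers is an assumption about how the quoted breakdown adds up.

```python
# Counts taken from the talk; variable names are illustrative only.
candidate_snps = 48        # SNPs sent for KASP assay design
failed_amplification = 9   # assays that gave no amplification
monomorphic = 16           # SNPs that showed no variation when re-tested

validated = candidate_snps - failed_amplification - monomorphic
validated_share = validated / candidate_snps

# Assumed reading of the panel breakdown described in the quote.
terpene_model = 18
thc_expression = 3
seed_production = 1
mitochondrial = 1
panel_total = terpene_model + thc_expression + seed_production + mitochondrial

print(f"Validated markers: {validated} of {candidate_snps} "
      f"({validated_share:.1%})")
print(f"Panel breakdown adds up to: {panel_total}")
```

Under that reading, both the subtraction and the breakdown arrive at the same 23 markers, which is just under half of the 48 candidates.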
“Here is a comparison to show the performance of these SNPs and how they work when you are trying to assign these accessions back to their chemovar cluster. In this case there is one individual that is misassigned, 50% to another group. The other 69 individuals, genotyped at 18 SNPs, were assigned to the right group. It is being mirrored in other jurisdictions, so here is a study that was done in New Mexico where they show similar myrcene, terpinolene and limonene clustering. Here is an example of a data set that is still a work in progress. We took this panel and genotyped a large pool from Canadian licensed producers. You can see in pink in the top left, European hemp, that clusters together, even in a totally independent data set. So, this is British Columbia as opposed to Nevada, where the initial SNP markers were developed. We still see the same clustering when you go back to these three major groups, myrcene, limonene and terpinolene, expressed in cannabis. You see the little green cloud that refers to a variety that clusters with the myrcene dominant varieties but actually has alpha-pinene as its main terpene. This was confirmed in New Mexico, where pinene and myrcene are always found associated with each other.
“Something quite interesting with these 18 markers is that even though it is a very small number of markers, it’s a cheap technology that we can implement in-house and often in the field. It offers essentially an individual barcode for every variety anybody grows. It enables us to have some transparency in terms of the origin of different variants. When we talk about products going from a producer to a retailer, we have the ability to track these varieties. It is a very simple system as opposed to using more advanced sequencing approaches.
“Here is another example of how these markers are clustering based on the position of each individual, so you are able to find a particular allele associated with, for example, hemp. If you look on the top left, we have a THC synthase allele that is always homozygous in hemp and is in all drug type cannabis. It is also a mitochondrial marker. That enables identification of not only whether the plant is going to express CBD or THC, but it is also a very targeted test where you are using only two SNPs to tease apart CBD expressing fiber types versus CBD expressing drug types. This can be assessed in vegetative plants, so you do not have to wait the full 12 weeks it takes to grow the plant and then flower it to have the expression of cannabinoids in the flower. You can do it in the seedling stage. When we are talking about accelerated breeding… or when we are trying to introgress a particular part of the genome into a commercial variety, for example, you are able to do this with this type of approach. Recently it was published that both CBD and THC synthase genes are found on chromosome 9, but there are others on another chromosome that affect the quantity of cannabinoids being expressed. With that information at hand and targeted markers that enable us to decipher that information, I think we will see a lot of novel varieties emerge where we have targeted introgression of particular traits such as minor cannabinoid expression. As an example, [take] a wild or semi-wild population [of hemp] and introgress that particular part of the genome into a commercial variety so that we maintain all of the agronomic traits of interest. It’s a very exciting time.
“Another side note, here we are looking more at hemp varieties as opposed to drug type cannabis. What is interesting is this particular variety here, which is called X59 or the Hemp Nut, and is where we get a lot of the hemp seeds in Canada. It is a certified variety, and if you go into a field of X59 you will notice several different phenotypes. In this case here, not only are these plants different sizes, which we have characterized as tall, medium and short, you will also notice a large difference in fragrance based on the different phenotypes. The shorter plants are much more fragrant and also exhibit a lot more medicinal trichome production than what you see on the larger plants. Being able to take a variety like that and from seed be able to predict which phenotype it is going to fall into… it offers lots of promise in terms of being able to have more consistent production and more targeted production depending on the philosophy of the cultivator.”
Dr. Henry concluded by addressing a few questions and talking about the different forms of verification. “I mentioned a variety of verifications. I think the last one I saw was 1.4 million SNPs from a Canadian company. Here I show that with 18 SNPs you are actually able to tell varieties apart. So, whole genome sequencing or doing some type of sequencing is not the most parsimonious tool to use for these purposes. One of the great things is that this is something that can be done in-house at relatively low cost and also, I think, very efficiently, given that you do not have to wait several weeks to get results back; you can get them the same day.”
He addressed a quick question concerning dominant levels of a terpene in a cannabis strain. “When we are talking about a particular terpene being dominant in a certain variety, the number is typically 0.5%, but very often you will see that some have more than 1%. But, 0.5% is really the cutoff.”
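The 0.5% cutoff lends itself to a small worked example. The sketch below is only an illustration: the function name `dominant_terpene`, the sample profiles and the choice to treat "at or above 0.5%" as dominant are all assumptions, not part of Dr. Henry's method.

```python
DOMINANCE_CUTOFF_PCT = 0.5  # cutoff quoted in the talk, in percent


def dominant_terpene(profile, cutoff=DOMINANCE_CUTOFF_PCT):
    """Return the most abundant terpene if it reaches the cutoff, else None.

    `profile` maps terpene names to levels in percent. Treating a level
    equal to the cutoff as dominant is an assumption made for this sketch.
    """
    if not profile:
        return None
    name, level = max(profile.items(), key=lambda item: item[1])
    return name if level >= cutoff else None


# Hypothetical lab profiles, invented for illustration.
strong_profile = {"beta-myrcene": 1.1, "limonene": 0.3, "terpinolene": 0.05}
weak_profile = {"beta-myrcene": 0.2, "limonene": 0.3, "terpinolene": 0.05}

print(dominant_terpene(strong_profile))
print(dominant_terpene(weak_profile))
```

In the first profile, beta-myrcene is both the most abundant terpene and above the cutoff, so it is reported as dominant. In the second, limonene is the most abundant but falls short of 0.5%, so no terpene is reported.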
The science underlying this article is so ahead of its time I am not sure readers can grasp how powerful it is.