Using the iNaturalist API to detect novel records

How many never-before recorded species did we have in Uruguay in 2023?

Note: I updated this post because I found an easier way of doing it. You can find the previous version here.

This time, I’m interested in detecting whether any of the records uploaded to iNaturalist in Uruguay during the past year, 2023, belong to species that are new for NaturalistaUY (our national site) or for the iNaturalist global platform. To answer this, I once again explored the iNaturalist API. If you want to know more about my first experiment with the API, see my previous post How many users in NaturalistaUY are Uruguayan?.

So, I want to know if there is an observation that records:

  • a new species for iNaturalist, i.e., a species recorded in Uruguay in 2023 that has no previous records on the platform.
  • a new species for NaturalistaUY, i.e., a species recorded in Uruguay in 2023 that has no previous records in Uruguay.

Luckily, the API has an endpoint, observations/species_counts, which can take a taxon id or a list of taxon ids (taxon_id) and a location (place_id), and returns, among other things, the number of observations the species has for that place (count) and the number of observations the species has globally (observations_count).
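To make the two fields concrete, here is a minimal sketch of how such a response could be parsed. The JSON string below is a hand-written mock, not a real API response: it only mirrors the fields mentioned above (real responses carry many more), and its values are taken from the output table further down in this post.

```r
library(jsonlite)

# Hand-written mock of a species_counts response (assumption: only the
# fields used in this post are included; a real response has many more)
mock_json <- '{
  "total_results": 1,
  "results": [
    {"count": 202,
     "taxon": {"id": 18262,
               "name": "Colaptes campestris",
               "rank": "species",
               "observations_count": 4627}}
  ]
}'

# flatten = TRUE turns the nested "taxon" object into taxon.* columns
parsed <- fromJSON(mock_json, flatten = TRUE)

parsed$total_results                     # number of taxa returned
parsed$results$count                     # observations at the place
parsed$results$taxon.observations_count  # observations globally
```

Here count plays the role of the per-place total and taxon.observations_count the global one.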

So, let’s create the function getTaxonCount(), which takes a taxon_id (or a list of them) and a place_id as arguments, and returns the taxon_name, taxon_rank (e.g., species, genus, or family), observations_place (number of records for the species at the place_id), and observations_iNat (number of records for the species globally on iNat). The place_id of Uruguay is 7259.

library(httr)
library(jsonlite)
library(knitr)
library(tidyverse)

getTaxonCount <- function(taxon_id, place_id = 7259){

  taxonCount <- tibble(taxon_id = numeric(),
                       taxon_name = character(),
                       taxon_rank = character(),
                       observations_place = numeric(),
                       observations_iNat = numeric())

  num_results <- 1 # call counter: used to pause between calls and to number the console output

  for (taxon_id_i in taxon_id) {

    if (num_results %% 10 == 0) {
      Sys.sleep(10) # every 10 calls, pause for 10 seconds
    }

    call_url <- str_glue('https://api.inaturalist.org/v1/observations/species_counts/?',
                         'verifiable=true&',
                         'place_id={place_id}&',
                         'taxon_id={taxon_id_i}')

    get_json_call <- GET(url = call_url) %>%
      content(as = "text") %>% fromJSON(flatten = TRUE)
    results <- as_tibble(get_json_call$results)

    if(get_json_call$total_results == 1) {

      taxonCount_i <- tibble(taxon_id = results$taxon.id,
                             taxon_name = results$taxon.name,
                             taxon_rank = results$taxon.rank,
                             observations_place = results$count,
                             observations_iNat = results$taxon.observations_count)
      taxonCount <- rbind(taxonCount, taxonCount_i)
      cat(num_results-1, 'taxon:', taxonCount_i$taxon_name,
          taxonCount_i$observations_iNat, 'observations on iNat\n')
    } else {
      cat(num_results, taxon_id_i,  'no info\n')
    }
    num_results <- num_results + 1
  }
  return(taxonCount)
}
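Since the endpoint is described above as accepting a list of ids, a possible variation is to send several ids per request instead of one. The sketch below only builds the request URLs and does not call the API. The helper build_batch_urls() and the chunk size of 10 are hypothetical, and the comma-separated taxon_id format is an assumption that I have not tested against the live API here. Note that with batched requests the total_results == 1 check used in getTaxonCount() no longer applies, so the results would need to be matched back to the requested ids by taxon.id.

```r
# Hypothetical helper: split taxon ids into chunks and build one URL per chunk
build_batch_urls <- function(taxon_ids, place_id = 7259, chunk_size = 10) {
  # Assign each id to a chunk: ids 1-10 go to chunk 1, 11-20 to chunk 2, ...
  chunks <- split(taxon_ids, ceiling(seq_along(taxon_ids) / chunk_size))

  vapply(chunks, function(chunk) {
    paste0("https://api.inaturalist.org/v1/observations/species_counts/?",
           "verifiable=true&",
           "place_id=", place_id, "&",
           "taxon_id=", paste(chunk, collapse = ","))  # assumed list format
  }, character(1))
}

# Toy ids 1..25 (not real taxon ids) give three URLs: 10 + 10 + 5 ids
urls <- build_batch_urls(1:25)
length(urls)
urls[[1]]
```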

Warning: If the taxon_id belongs to a family, order, or another higher taxonomic rank, the search may return many taxa (all those recorded under that taxon_id). To avoid this, the function getTaxonCount() only keeps the result when the call returns exactly one taxon, and prints 'no info' otherwise.

To use the function, I downloaded all the observations recorded in Uruguay in 2023 from here: naturalista.uy/observations/export, using the following URL query: quality_grade=any&identifications=any&place_id=7259&verifiable=true&d1=2023-01-01&d2=2023-12-31.

Once the data was downloaded, I read the file.

observations_2023 <- read_csv('datos/observations-2023-UY.csv',
                              guess_max = 30000)

And then I got a unique list of taxon IDs (taxa_list).

taxa_list <- observations_2023 %>%
  filter(!is.na(taxon_id)) %>%
  distinct(taxon_id) %>% pull(taxon_id)

Finally, I ran the function getTaxonCount() with the taxa_list I extracted as its argument.

taxon_count_observations_2023 <- getTaxonCount(taxon_id = taxa_list,
                                              place_id = 7259)
1 121850 no info
1 taxon: Xiruana 279 observations on iNat
2 taxon: Excirolana armata 16 observations on iNat
3 taxon: Rapana venosa 807 observations on iNat
4 taxon: Cyperus trigynus 132 observations on iNat
5 taxon: Salvator merianae 7033 observations on iNat
6 taxon: Colaptes campestris 4627 observations on iNat
7 taxon: Teius oculatus 241 observations on iNat
8 taxon: Nierembergia aristata 130 observations on iNat
9 taxon: Ctenucha rubriceps 879 observations on iNat

# A tibble: 9 × 5
  taxon_id taxon_name            taxon_rank observations_place observations_iNat
     <int> <chr>                 <chr>                   <int>             <int>
1   419737 Xiruana               genus                      15               279
2  1028885 Excirolana armata     species                     5                16
3   370913 Rapana venosa         species                    68               807
4  1515377 Cyperus trigynus      species                    72               132
5   318758 Salvator merianae     species                   297              7033
6    18262 Colaptes campestris   species                   202              4627
7   113828 Teius oculatus        species                    44               241
8   961374 Nierembergia aristata species                     9               130
9   541510 Ctenucha rubriceps    species                    46               879

We will now merge this table with our original data (all the observations), to get, for each taxon_id, the number of records on the platform and the number of records in Uruguay.

observations_2023 <- left_join(observations_2023,
                               taxon_count_observations_2023,
                               by='taxon_id',
                               relationship='many-to-one')
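As a small illustration of what this join does, here is a sketch on toy data. The observation ids are made up; the taxon ids and counts come from the console output above. Taxa for which getTaxonCount() printed 'no info' are absent from the lookup table, so their observations end up with NA in the new columns and drop out of the filters used below. The relationship argument assumes dplyr 1.1.0 or later.

```r
library(dplyr)
library(tibble)

# Toy observations: two records of one taxon, one record of a taxon
# that returned 'no info' (observation ids are hypothetical)
obs_toy <- tibble(
  id = c(1, 2, 3),
  taxon_id = c(18262, 18262, 121850)
)

# Toy lookup table shaped like the output of getTaxonCount()
counts_toy <- tibble(
  taxon_id = 18262,
  taxon_name = "Colaptes campestris",
  taxon_rank = "species",
  observations_place = 202,
  observations_iNat = 4627
)

# Many observations can share one taxon, but each taxon_id
# must appear at most once in the lookup table
joined_toy <- left_join(obs_toy, counts_toy,
                        by = "taxon_id",
                        relationship = "many-to-one")
joined_toy
```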

Finally, let’s answer our questions.

1. A new species for iNaturalist

To answer this, we will use the field observations_iNat and check that the taxon_id’s taxon_rank is ‘species’, the observation is ‘Research Grade’, and observations_iNat equals 1, meaning the Uruguayan record is the only one on the platform. Those that pass this check are new species for the platform, recorded in Uruguay!

observations_2023 %>%
  filter(taxon_rank == 'species' &
           quality_grade == 'research' &
           observations_iNat==1) %>%
  select(iconic_taxon_name,
         taxon_name, observations_iNat) %>%
  arrange(iconic_taxon_name, taxon_name) %>%
  kable()

iconic_taxon_name  taxon_name                  observations_iNat
Insecta            Chlaenius violatus                          1
Insecta            Eutheria piperata                           1
Plantae            Euphorbia burkartii                         1
Plantae            Grindelia linearifolia                      1
Plantae            Herbertia furcata                           1
Plantae            Mikania sulcata                             1
Plantae            Noticastrum chebataroffii                   1
Plantae            Pavonia glutinosa                           1
Plantae            Pavonia orientalis                          1
Plantae            Sisyrinchium rosengurttii                   1
Plantae            Vicia montevidensis                         1

This means 11 new species were recorded on the platform in 2023!

2. A new species for NaturalistaUY

In total we have 3209 species in Uruguay for 2023. Here’s a sample of the most recorded species per iconic taxon group.

observations_2023 %>%
  filter(taxon_rank == 'species') %>%
  group_by(iconic_taxon_name, taxon_name) %>%
  count() %>%
  group_by(iconic_taxon_name) %>%
  filter(n == max(n)) %>%
  kable()

iconic_taxon_name  taxon_name                    n
Actinopterygii     Diplodus argenteus           15
Amphibia           Boana pulchella              72
Animalia           Bunodosoma cangicum          42
Arachnida          Argiope argentata            68
Aves               Furnarius rufus             108
Chromista          Pseudomicrothorax agilis      3
Fungi              Trametes sanguinea           31
Insecta            Harmonia axyridis            91
Mammalia           Hydrochoerus hydrochaeris    60
Mollusca           Pachycymbiola brasiliana     27
Plantae            Senecio crassiflorus        165
Protozoa           Fuligo septica                2
Protozoa           Reticularia lycoperdon        2
Reptilia           Salvator merianae            69
NA                 Nostoc commune                4

To know how many species were recorded for the first time in Uruguay, we use the field observations_place. If the taxon_id’s taxon_rank is ‘species’, the observation is ‘Research Grade’, and observations_place equals 1, then we have a new species for Uruguay! And we actually have many 🤩

Here’s a sample of new species per iconic taxon group:

observations_2023 %>%
  filter(taxon_rank == 'species' &
           quality_grade == 'research' &
           observations_place==1) %>%
  group_by(iconic_taxon_name, taxon_name) %>%
  count() %>%
  group_by(iconic_taxon_name) %>%
  slice_head(n=3) %>%
  kable()

iconic_taxon_name  taxon_name                   n
Actinopterygii     Aluterus monoceros           1
Actinopterygii     Austrolebias alexandri       1
Actinopterygii     Austrolebias melanoorus      1
Animalia           Potamotrygon falkneri        1
Arachnida          Acropsopilio chilensis       1
Arachnida          Akela ruricola               1
Arachnida          Creugas lisei                1
Aves               Arundinicola leucocephala    1
Aves               Fulmarus glacialis           1
Aves               Fulmarus glacialoides        1
Fungi              Agaricus devoniensis         1
Fungi              Cookeina speciosa            1
Fungi              Hygrocybe flavescens         1
Insecta            Acledra fraterna             1
Insecta            Adimantus ornatissimus       1
Insecta            Anisophya arreguii           1
Mammalia           Chrysocyon brachyurus        1
Mollusca           Pomacea scalaris             1
Plantae            Amaranthus caudatus          1
Plantae            Apium graveolens             1
Plantae            Asplenium inaequilaterale    1
Protozoa           Reticularia lycoperdon       1
Reptilia           Caiman yacare                1
Reptilia           Tropidurus torquatus         1

In total there were 131 new species recorded in iNaturalist for Uruguay in 2023! Woooow!
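The code that produced this total is not shown in the post. One way to get such a number, as a sketch on toy data, is to apply the same filter and count distinct species. The table below is a made-up stand-in for the joined observations_2023 table: the species names come from the sample above, while the observations_place value of 500 and the 'needs_id' row are purely illustrative.

```r
library(dplyr)
library(tibble)

# Toy stand-in for the joined observations_2023 table (illustrative values)
obs_toy <- tibble(
  iconic_taxon_name = c("Reptilia", "Reptilia", "Mammalia", "Aves", "Plantae"),
  taxon_name = c("Caiman yacare", "Tropidurus torquatus",
                 "Chrysocyon brachyurus", "Furnarius rufus",
                 "Apium graveolens"),
  taxon_rank = "species",
  quality_grade = c("research", "research", "research",
                    "research", "needs_id"),
  observations_place = c(1, 1, 1, 500, 1)
)

# Same filter as above, then keep one row per species
new_species_toy <- obs_toy %>%
  filter(taxon_rank == "species",
         quality_grade == "research",
         observations_place == 1) %>%
  distinct(iconic_taxon_name, taxon_name)

# Total number of new species, and the breakdown per iconic taxon group
n_new_toy <- nrow(new_species_toy)
per_group_toy <- count(new_species_toy, iconic_taxon_name)

n_new_toy
per_group_toy
```

In this toy table, the common species and the record that is not Research Grade are both excluded, leaving three new species.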

And here are the top ten users contributing to these new records:

observations_2023 %>%
  filter(taxon_rank == 'species' &
           quality_grade == 'research' &
           observations_place==1) %>%
  group_by(user_login) %>%
  count() %>%
  arrange(desc(n)) %>% head(n=10) %>%
  kable()

user_login          n
enriquecenoz       12
santiagomailhos    11
amailhos            9
luisvescia          6
javierpiquillen     5
gusper              4
msilvera            4
lautaro_fuentes     3
m_coronel94         3
martzz              3

Congrats!

And that’s all!

Hope you find this useful too ✨

Florencia Grattarola
Postdoc Researcher

Uruguayan biologist doing research in macroecology and biodiversity informatics.