First steps with the iNaturalist API

How many users in NaturalistaUY are Uruguayan?

I wanted to know how many of the users generating records in NaturalistaUY (the Uruguayan site of iNaturalist) are, in fact, Uruguayan. Together with Rodrigo Montiel, I am assessing the profile of observers in Uruguay (check his repo here), so we need to detect the users who are not from Uruguay and remove the data they generated from our dataset. This was a perfect use case to test the iNat API, which I had not explored until now.

Basically, this API enables you to query the iNaturalist database by using different methods, parameters and values. For instance, you can search and fetch data on observers, such as number of observations (observation_count) or number of species observed (species_count), by providing a username (user_login). The result is a JSON that you can parse and analyse.
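As a minimal sketch of what "method, parameters and values" means in practice, the snippet below assembles a query URL from an endpoint and a named list of parameters. The base URL and the helper name build_inat_url() are assumptions made for illustration; they are not taken from the text, so check them against the API documentation before relying on them.

```r
# Assumed base URL of the iNaturalist API (v1); not stated in this post.
inat_base_url <- "https://api.inaturalist.org/v1"

# Hypothetical helper: build a query URL from an endpoint and named parameters.
build_inat_url <- function(endpoint, params) {
  encoded_values <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  query <- paste0(names(params), "=", encoded_values, collapse = "&")
  paste0(inat_base_url, endpoint, "?", query)
}

observers_url <- build_inat_url("/observations/observers",
                                list(user_login = "flo_grattarola"))
print(observers_url)

# Fetching and parsing would then be one call (needs network access):
# response <- jsonlite::fromJSON(observers_url, flatten = TRUE)
```

Keeping the parameters in a named list makes it easy to add more filters later without editing a long URL string by hand.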

Let’s get on with it!

To start with an example, I'm going to use my own username (flo_grattarola) as the value of the user_login parameter and query GET /observations/observers (see it in the API), which 'returns observers of observations matching the search criteria and the count of observations and distinct taxa of rank species they have observed'.

So, by running the following query:


We get the following output in JSON format:

{
  "total_results": 1,
  "page": 1,
  "per_page": 500,
  "results": [
    {
      "user_id": 736016,
      "observation_count": 3649,
      "species_count": 1246,
      "user": {
        "id": 736016,
        "login": "flo_grattarola",
        "spam": false,
        "suspended": false,
        "created_at": "2017-12-15T15:54:34+00:00",
        "login_autocomplete": "flo_grattarola",
        "login_exact": "flo_grattarola",
        "name": "Florencia Grattarola",
        "name_autocomplete": "Florencia Grattarola",
        "orcid": "",
        "icon": "",
        "observations_count": 3649,
        "identifications_count": 5779,
        "journal_posts_count": 1,
        "activity_count": 9429,
        "species_count": 1419,
        "universal_search_rank": 3649,
        "roles": [],
        "site_id": 28,
        "icon_url": ""
      }
    }
  ]
}

Great! This query gives us all the data we need, especially the observation_count value for each user.
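To show how those values can be pulled out of the response, here is a small sketch that parses a trimmed copy of the JSON above with jsonlite. The string response_text is a shortened stand-in for the real response, kept only for illustration. With flatten = TRUE, the nested user fields become columns such as user.login.

```r
library(jsonlite)

# Trimmed copy of the response shown above (most fields left out for brevity).
response_text <- '{
  "total_results": 1,
  "results": [
    {
      "user_id": 736016,
      "observation_count": 3649,
      "species_count": 1246,
      "user": {"id": 736016, "login": "flo_grattarola"}
    }
  ]
}'

parsed <- fromJSON(response_text, flatten = TRUE)

# `results` is parsed as a data frame with one row per observer.
observer <- parsed$results
print(observer$user.login)
print(observer$observation_count)
```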

Assessing users in NaturalistaUY

First, we need to download the NaturalistaUY data (ours was downloaded on 2022-10-27). Then, we calculate the number of observations made by each user in Uruguay (observation_count_UY), keeping the user_id and user_login variables. You can do this with the dplyr functions group_by() and count(). You could also get these counts from the API, but in our case we already had the data.
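The counting step could look roughly like the sketch below. The observations table and its user names are made-up toy data standing in for the downloaded file, and the object name NatUY_users is hypothetical; only the column names user_id, user_login and observation_count_UY come from the text.

```r
library(dplyr)

# Toy stand-in for the downloaded observations: one row per observation.
observations <- tibble(
  user_id    = c(1, 1, 1, 2, 2, 3),
  user_login = c("user_a", "user_a", "user_a", "user_b", "user_b", "user_c")
)

# One row per user, with the number of observations made in Uruguay.
NatUY_users <- observations %>%
  group_by(user_id, user_login) %>%
  count(name = "observation_count_UY") %>%
  ungroup()

print(NatUY_users)
```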

We found a total of 1,788 users in NaturalistaUY. The user with the largest number of records has 4,755 observations and, on average, users have uploaded 29.9 records to iNat. Here's a glimpse of the data:


Create a function to run the query for multiple users

The idea now is to run the test query from above for all the users of NaturalistaUY, get their observation_count values and compare them with the number of observations these users have in Uruguay (i.e., the proportion of observations recorded in Uruguay vs. in the rest of the world).

The function get_observers_num_observations() takes a list of users (user_login) and returns a tibble with user_id, user_login and observation_count_iNat. This last count is the total number of observations of the users in the platform.

An important consideration when using this API is that we could overload it by querying all users at once, as iNat limits API usage to a maximum of 100 requests per minute. So, we need to add a delay to the fetching process. We will do this by pausing for 10 seconds after every ten users, using Sys.sleep().
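A quick back-of-the-envelope check of that delay, using the numbers mentioned in this post (1,788 users, a 10-second pause every ten users). The variable names are only illustrative, and the rate is a long-run upper bound that ignores the time each request takes.

```r
pause_seconds   <- 10    # length of each pause
users_per_pause <- 10    # one pause every ten users
n_users         <- 1788  # users found in NaturalistaUY

# Even if every request were instantaneous, the pauses alone cap the average rate.
max_requests_per_minute <- users_per_pause * 60 / pause_seconds

# Number of pauses over the whole run and the time spent sleeping.
n_pauses            <- n_users %/% users_per_pause
total_pause_minutes <- n_pauses * pause_seconds / 60

print(max_requests_per_minute)
print(n_pauses)
print(round(total_pause_minutes, 1))
```

So the pauses keep the average rate well under the limit, at the cost of roughly half an hour of waiting for the full user list.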

Here’s the function:


library(dplyr)    # tibble(), as_tibble() and the %>% pipe
library(httr)     # GET() and content()
library(jsonlite) # fromJSON()

get_observers_num_observations <- function(user_login_list) {
  observers_num_observations <- tibble(user_id = numeric(),
                                       user_login = character(),
                                       observation_count_iNat = numeric())

  num_results <- 1  # used to pause the API calls and to print progress to the console

  for (user_login in user_login_list) {

    # The API needs a delay, otherwise it returns an error:
    # every 10 users, the code pauses for 10 seconds
    if (num_results %% 10 == 0) {
      Sys.sleep(10)
    }

    # NOTE: the endpoint URL is missing from this copy of the post; the empty string
    # should hold the GET /observations/observers query, ending in the user_login parameter
    call <- paste0("", user_login)

    get_json_call <- GET(url = call) %>%
      content(as = "text", encoding = "UTF-8") %>%
      fromJSON(flatten = TRUE)

    # Treat a missing response or an empty `results` field as "user not found"
    if (is.null(get_json_call) || length(get_json_call$results) == 0) {
      observer_num_observations <- tibble(user_id = NA,
                                          user_login = user_login,
                                          observation_count_iNat = NA)
      observers_num_observations <- rbind(observers_num_observations, observer_num_observations)
      cat(num_results, 'usuario:', user_login, '--> NOT FOUND', '\n')
    } else {
      results <- as_tibble(get_json_call$results)
      observer_num_observations <- tibble(user_id = results$user_id,
                                          user_login = results$user.login,
                                          observation_count_iNat = results$observation_count)

      observers_num_observations <- rbind(observers_num_observations, observer_num_observations)
      cat(num_results, 'usuario:', user_login, '--> DONE', '\n')
    }
    num_results <- nrow(observers_num_observations) + 1
  }

  observers_num_observations
}

It is probably more complicated than it needs to be, but it does the job 🥹

Let’s run it

To run the function, we need to provide a list of user_login values. As an example, let's use the previous list.

 [1] "noelia"              "romigaleota"         "goncrisdi"          
 [4] "jorgejuanrueda"      "beln15"              "ceciliapomboposente"
 [7] "patriciabidondo"     "weba69"              "bert_in_the_skirt"  
[10] "vanesssa_v"         

When we run it, the function prints to the console the user_login it is currently assessing, so we can keep track of the progress.

NatUY_users_assessment <- get_observers_num_observations(NatUY_users_selection$user_login)

If the user is found, it prints the user_login followed by --> DONE; if the user is not found, it prints --> NOT FOUND.

1 usuario: noelia --> DONE
2 usuario: romigaleota --> DONE
3 usuario: goncrisdi --> DONE
4 usuario: jorgejuanrueda --> DONE
5 usuario: beln15 --> DONE
6 usuario: ceciliapomboposente --> DONE
7 usuario: patriciabidondo --> DONE
8 usuario: weba69 --> DONE
9 usuario: bert_in_the_skirt --> DONE
10 usuario: vanesssa_v --> DONE

How do results look?

In the end, we get a table with the counts.


Finally, we merge the results with our original table and calculate the proportion of each user's records that were recorded in Uruguay. We can even make a guess at who is Uruguayan (those who made more than 30% of their observations in Uruguay); see the variable esUruguaye.
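A sketch of this last step with made-up numbers. The tables users_UY and users_iNat, the column prop_UY and the choice of a logical esUruguaye are assumptions for illustration; the threshold is kept as a parameter, set here to the 30% mentioned above.

```r
library(dplyr)

# Toy stand-ins: counts in Uruguay and total counts returned by the API.
users_UY <- tibble(
  user_login           = c("user_a", "user_b", "user_c"),
  observation_count_UY = c(90, 5, 12)
)
users_iNat <- tibble(
  user_login             = c("user_a", "user_b", "user_c"),
  observation_count_iNat = c(100, 500, NA)  # NA: user not found by the API
)

threshold <- 0.3  # share of observations in Uruguay needed to count as Uruguayan

assessment <- users_UY %>%
  left_join(users_iNat, by = "user_login") %>%
  mutate(prop_UY    = observation_count_UY / observation_count_iNat,
         esUruguaye = prop_UY > threshold)

print(assessment)
```

Users that the API did not find end up with NA in esUruguaye, which keeps them separate from both groups.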



Of the 1,788 users, 1,282 are Uruguayans (i.e., have recorded more than 1/3 of their observations in Uruguay), while 517 are not. We also found 11 users who seem to have deleted their accounts on the platform and were therefore not found.

That’s all folks!

Florencia Grattarola
Postdoc Researcher

Uruguayan biologist doing research in macroecology and biodiversity informatics.