Changes in Bee Species Status in Washington D.C. area (Part 1 of 3)

Sam Droege

Sep 15, 2025, 9:38:44 AM

to beemon...@googlegroups.com

"...and though the holes were rather small, they had to count them all..."

A Day in the Life

Sgt. Pepper's Lonely Hearts Club Band

The Beatles

I have been thinking recently about the status of bees in the world. In the media and within the introductions and discussion sections of reports and scientific papers the status of bees usually gets summarized as a story of decline and disappearance. This is true, trivial really, as when you replace plant communities largely maintained by Nature with those largely maintained by man most species disappear, and that is just what we have done. At the extreme are lands completely deleted from Earth using concrete, asphalt, and buildings, followed by lands greatly degraded by agriculture, continuously mown landscapes, and the many ways we try to tidy up Nature. There are bees and other creatures that can persist and even thrive in those landscapes but most cannot.

Yet bees do persist. There are many parks, private wildlands, and just places we haven't had time to completely corrupt that clearly retain these less tolerant bee species and the plants they depend upon. So the question of bee decline is both apocalyptic (yes, there is massive species loss in human dominated landforms) and squishy (are all the bee species we had hundreds of years ago still present when we look within the residue of natural landscapes?).

Figuring out the squishy part is really important. We could intellectually foreclose on the human landscape if the bank of species (so to speak) is still ready for withdrawal in our wildish lands.

Is it?

Well, we have no national inventory or monitoring program to consult, but there are now several state efforts and past information on bee status resides in museums (who needs those, right?).

Those old records can be (and many have been) compiled and can be compared to recent collections (many of which have also been compiled). Several papers and reports have done this.

There are intellectual issues to contend with, however. In the perfect world, any set of specimens (old or new) would contain all the bee species present in an area and the numbers collected would reflect the real number present or at least the correct ratios of their real commonness. This is at least sort of true. A species that is common has a higher probability of being detected than a rare one for sure. But there are problems in probability land.

"Can you do addition?" the White Queen asked.

"What's one and one and one and one and one and one and one and one and one and one?"

"I don't know," said Alice. "I lost count."

Lewis Carroll - Through the Looking Glass.

OK, well and good. Let's just ignore the inconvenience of probability land for the moment and gather all of the available bee data from the past and present and run a regression line and see how our bees are doing. Done. But does that analysis reflect how bee populations have changed?

Nope.

Probability land is a land of corruption and our regression lines usually do not reflect trends in the true populations because of that corruption. Without correcting, qualifying, and interpreting the results through both a human and a bee behavior lens, the answer is a hard no to a straight regression and to its subsequent reporting and conclusions. The problem comes down to probabilities. Those darn probabilities of detection for a species and their shifts over time are a corrupting force in our search for answers to "are bees in trouble?"

So, this deserves some explanation. Below are some of the cloaking factors that I can think of that impact the relationship between specimens in a database and the true number of bees out there in the wild. In other words, alterations to any of these factors will skew bee counts away from reflecting the real population of bees and therefore impact our conclusions about how bees are doing. We want our dataset on bees to reflect changes in bee populations over time, not some unknowable combination of changes over time in bee populations and changes in the probabilities of detecting those bees.
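To make the confounding concrete, here is a minimal sketch in Python. It assumes a deliberately simple model in which the expected number of specimens is the true population times a per-day detection probability times the number of collecting days. The model, the function name, and every number are hypothetical illustrations, not anything measured in the DC region.

```python
# Hypothetical model and numbers, for illustration only.
def expected_count(population, detection_prob, effort_days):
    """Expected specimens = bees present x chance of catching each per day x days."""
    return population * detection_prob * effort_days

# The true population is identical in both periods; only detectability changes
# (say, because recent collectors pass over a common, easily recognized species).
historic = expected_count(population=1000, detection_prob=0.02, effort_days=10)
recent = expected_count(population=1000, detection_prob=0.01, effort_days=10)

# The specimen counts halve even though the bees themselves did not change.
apparent_change = (recent - historic) / historic
print(f"apparent change in counts: {apparent_change:.0%}")
```

A regression through such counts would report a decline that exists only in the probability of detection.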

I believe these are some of the primary factors:

Time: If you spend more time (days/hours) trying to catch bees you will catch more, spend less time and you catch fewer (I almost said "less", a grievous grammatical error that sadly is no longer being enforced).

Date: If you change the dates you go out looking for bees you will get different species and counts of bees.

Technique: Catching bees with a net, Malaise trap, bowl trap, vane trap etc. favors the capture of some bees and disfavors the capture of others.

Experience: An experienced person will catch more bees and different species of bees than an inexperienced person; this is most obvious when comparing netting results. And. Even experienced people differ in how they approach the capture of bees and each collector will favor the capture of different species of bees and numbers of those species depending on their proclivities.

Retention: This one is not often thought about, for sure. If you either avoid capturing certain species (do people really capture every honey and bumble bee they see?) in the field or pitch them after you have brought them back to your sorting table (do I really need another Augochlorella aurata in my collection?) or never identify them (e.g., I still don't identify most Lasioglossum males in the Dialictus group to species....) you impact the resulting "counts" that would be used to calculate change.

Taxonomy and Identification: Through time species are lumped together as well as split into new species. In some cases this means that identifications in the literature or a database can't be safely ascribed to species prior to analysis and must be dropped or lumped into "groups".

Location: Bee species are not distributed evenly across any state, county, city or even within a single field. Some bee species are primarily found in fields, some woods, some beaches, some follow rivers, some only on mountain tops, and a few reach peak abundance in urban areas ... you get the point. One would want a long-term dataset to sample evenly across these habitats, or at least some subset consistently. Usually, if you do it right, you get close to this if you have a true monitoring program. But again, we don't have any monitoring programs for bees out there. We have a collection of data points (occurrence data) collected for all sorts of reasons, using all sorts of techniques, on different dates, and in different places by different people. If, for example, long ago people sampled mostly near towns but recent people sampled throughout the state, comparisons may require restricting the comparative area to only those places that were sampled consistently. If in time one people sampled in woodlands and in time two agricultural fields, what would snapping a trend line through those data points tell you?
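The idea of restricting a comparison to consistently sampled places can be sketched as a simple set intersection. This is only an illustration: the site names, the two period labels, and the record layout are hypothetical, and real data would need a defensible definition of what counts as the "same" site.

```python
# Hypothetical records: (site, period, species). Site names are placeholders.
records = [
    ("site_A", "historic", "Augochlorella aurata"),
    ("site_A", "recent", "Augochlorella aurata"),
    ("site_B", "historic", "Apis mellifera"),  # sampled only long ago
    ("site_C", "recent", "Apis mellifera"),    # sampled only recently
]

def sites_in_period(rows, period):
    """Return the set of sites that have at least one record in the given period."""
    return {site for site, row_period, _ in rows if row_period == period}

# Keep only sites that were sampled in both periods.
shared_sites = sites_in_period(records, "historic") & sites_in_period(records, "recent")
comparable = [row for row in records if row[0] in shared_sites]
```

Here only the two site_A records survive, so any then-versus-now comparison is at least drawn from the same ground.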

In the United States, where we have both old and new data inhabiting our databases, all these factors come into play and our interpretation of status and change becomes tricky. It is easy enough to simply pooh pooh (what is the proper spelling here of pooh pooh?) any such analysis and walk away, but I think there is an analysis path forward that provides insight into how our bees are doing. Or. At least provides grounds for hypotheses, targets for data collection efforts, and conservative lists of species of concern.

Much of my musing here comes from our regional work to document the bees of Maryland and the District of Columbia. We have plenty of old and new data for the roughly 450 species of bees found so far in Maryland and DC. We have enough data now to document what the common bees are, but a veil starts descending as the recent records for individual species become fewer and fewer and we see that some species were found in the old days but are not found now.

Our goal for what follows is to see if we can glean understanding (broad or narrow) regarding changes in bee populations in the Washington D.C. area using the data available.

The Three Subregions

Consistent historic data on bee species in Maryland and Washington D.C. (DC) only exist for DC and the Maryland counties of Montgomery (MOCO) and Prince George's (PG). The reason we have a lot of past data for these areas is the extensive presence of government and private collectors going back to the late 1800s. It is certainly one of the best collected regions in the country.

Three sources of data were used.

GBIF: This dataset represents specimens recorded in the Global Biodiversity Information Facility's (GBIF) database from the year 2000 and earlier.

BIML: This dataset represents specimens found in the USGS/FWS Bee Lab (BIML) database from 2001 until present.

Other: Additional data are available from many other sources (local collectors, iNaturalist, literature, university collections) and, due to their problems (detection probabilities, you know), are used in a limited (but important!) way.

The most potentially informative use of these datasets appears to be the comparison between GBIF and BIML, but only using the subset of netting data from BIML. Other data are also surprisingly useful and are brought into the discussion to illuminate species that are/were present but not detected within the GBIF/BIML nettiverse.

Descriptions of the Datasets

GBIF: Bee data were downloaded from the GBIF website. Data with collection dates after 2000 were discarded. The remaining data represent bee species found by netting (it is possible there were some Malaise samples in there somewhere, but we have no evidence for that). It turns out that many bees were collected in the DC area but ended up being deposited in collections from around the country. Many collectors were involved. There were 87 different years with data and every decade was represented.
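The date cutoff described above might look like the following sketch. The record layout (dicts with "species" and "year" keys) and the helper name are assumptions made for illustration, and dropping records with no year is also an assumption, since the post does not say how undated specimens were handled.

```python
CUTOFF_YEAR = 2000  # from the post: records dated after 2000 were discarded

def keep_historic(rows, cutoff=CUTOFF_YEAR):
    """Keep rows with a known collection year at or before the cutoff.

    Rows with a missing year are dropped (an assumption, not stated in the post).
    """
    return [r for r in rows if r.get("year") is not None and r["year"] <= cutoff]

# Hypothetical downloaded rows.
rows = [
    {"species": "Augochlorella aurata", "year": 1915},
    {"species": "Apis mellifera", "year": 2000},
    {"species": "Augochlorella aurata", "year": 2001},
    {"species": "Apis mellifera", "year": None},
]

historic_rows = keep_historic(rows)
```

With these placeholder rows, only the 1915 and 2000 records remain.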

Whereas, as you will see, a lot of DC region data are available in GBIF, the Smithsonian's Natural History Collection (NMNH) is under-represented. Only bumble bees at NMNH have been databased. Databasing of the rest of the collection is in the works, but nothing more is available at this point.

While we know that there are bee specimens at NMNH that are not in the GBIF database, we can also safely presume there to be AWOL specimens in collections around the country. In addition to the fact that some historic data will not be in the database, we have to keep in mind that rare species were largely kept by the old collectors and common species were often either passed over in the field or discarded prior to pinning.

BIML: Bee data found in the BIML database come from many sources. Much of the data can be tagged directly to Bee Lab activities but many other groups in collaboration with the Bee Lab have their data deposited in BIML after BIML staff have looked over the identifications. A diverse set of techniques is also represented (e.g., netting, bowl traps, glycol traps, vane traps, bucket traps, Malaise traps). All specimens captured were identified to species and entered into the database, though some data still await identifications. This is quite the contrast with the GBIF data which, as noted, would be highly biased towards rare species. On the positive side the netting data would have been collected in approximately the same way as that in GBIF. When netting bees the netter is basically using their understanding of patterns of occurrence and floral association to target bee-important flowering vegetation. The Bee Lab and associates never had a specific project to survey the bees of Maryland and DC, but they would devote time to collecting in different regions of the state when time permitted during the workweek and on weekends and holidays.

One way to help diminish the problem of bias in counts between these two time periods is to use the number of days a species was collected rather than the total number of bees collected. Recall that most bees in the GBIF dataset would have been discarded but all bees in the BIML dataset were kept, though when netting only a sample of the bumble bees, honey bees, and carpenter bees was taken for practical reasons. An assumption with the use of days rather than counts is that common bees will still show up on many days in both datasets, reflecting their commonness (though, as will be explained later, there are clear problems with this assumption for some species groups), with rarer bees having even less sampling bias. I will use only the number of days a species was collected in subsequent analyses.
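Counting days rather than specimens amounts to counting distinct collection dates per species. A minimal sketch, assuming one row per specimen with a species name and a date string (the rows and the function name are hypothetical):

```python
from collections import defaultdict

# Hypothetical specimen records: (species, collection_date), one row per specimen.
specimens = [
    ("Augochlorella aurata", "2010-05-01"),
    ("Augochlorella aurata", "2010-05-01"),
    ("Augochlorella aurata", "2010-05-01"),
    ("Augochlorella aurata", "2010-06-12"),
    ("Apis mellifera", "2010-05-01"),
]

def days_collected(rows):
    """Count the distinct dates on which each species was collected."""
    dates = defaultdict(set)
    for species, date in rows:
        dates[species].add(date)
    return {species: len(found) for species, found in dates.items()}

species_days = days_collected(specimens)
```

Four specimens of Augochlorella aurata collapse to two days, so whether a collector kept one or twenty from a given outing no longer matters.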

The patterns of sampling dates are listed below (note "trapping dates" refers to all the data that were collected using something other than a net in the BIML data set):

DC GBIF 75 Netting Dates
DC BIML 85 Netting Dates
DC BIML 196 Trapping Dates
MOCO GBIF 336 Netting Dates
MOCO BIML 30 Netting Dates
MOCO BIML 75 Trapping Dates
PG GBIF 171 Netting Dates
PG BIML 277 Netting Dates
PG BIML 585 Trapping Dates

Recall that GBIF data represent over 100 years of sampling and BIML data only 25 years. The two different time lengths are not a problem per se, but interpretations of the results have to keep them in mind, and the greater year range in the pooled historic data muffles what are likely some interesting changes. Turns out that things change over 100 years and the present analysis hides those changes, but then again it also helps flatten some of the rise and fall of population numbers when using the entire time period. I felt the GBIF data were sparse enough that dividing the data into more time periods would be a problem and a complication, so I leave it to a more clever person to come up with a better approach (if you are that clever person, I would be happy to give you all our data, oh, and even if you are not a clever person I am still glad to give you our data).

OK, a quick inspection of the number of netting dates for each subregion shows sometimes sharp differences in the number of sampling dates involved. No surprise. But while these differences (in addition to several other factors) obviously preclude direct comparisons, some of the problems are diminished if we involve our numerical friends, ratio and proportion. We will be using our friends in detail in a later section.
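One way ratio and proportion could enter is by dividing the number of days a species was netted by the number of netting dates available in that subregion and dataset. The sketch below uses the netting-date totals from the table above; the species-day values (42 and 3) are hypothetical, and this is not necessarily the calculation used in the later sections.

```python
# Netting dates per (subregion, source), copied from the table in the post.
NETTING_DATES = {
    ("DC", "GBIF"): 75, ("DC", "BIML"): 85,
    ("MOCO", "GBIF"): 336, ("MOCO", "BIML"): 30,
    ("PG", "GBIF"): 171, ("PG", "BIML"): 277,
}

def detection_rate(species_days, subregion, source):
    """Share of netting dates on which the species was collected."""
    return species_days / NETTING_DATES[(subregion, source)]

# Hypothetical species-day counts for one species in Montgomery County.
historic_rate = detection_rate(42, "MOCO", "GBIF")  # 42 of 336 dates
recent_rate = detection_rate(3, "MOCO", "BIML")     # 3 of 30 dates
```

In this made-up case the raw days fall from 42 to 3, yet the share of netting dates only moves from 12.5% to 10%, which is the kind of correction the unequal effort calls for.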

This is a good lead-in to talking about where sampling has taken place in the subregions over the years. Two major regional factors contribute to changes in the distribution of bees in the region during the last 125+ years. These are an increase in human density, with the concomitant increase in urban environments created by those humans, and a shift in the residual natural areas from open landscapes towards more wooded environments with greater maturity of the trees in those forests.

Historically, early collectors used the trolley networks in the region to go collecting. Their targets appeared to be the residual farms and open country of eastern Washington D.C. and the habitat along the Anacostia and Potomac Rivers. The Anacostia originally contained extensive freshwater tidal marshes (most of which became landfills at some point) and the Potomac has largely remained the same except that the area has become much more heavily wooded. One of these trolley systems ran along the Potomac to Glen Echo Park (an early amusement park) in Montgomery County. As can be seen from label information, this trolley was clearly used by naturalists on their collection trips to the Chain Bridge Flats in Washington D.C., Plummer's Island (home to the Washington Biologists' Field Club) just outside of Washington D.C., and the Glen Echo area. The landscape of all three of these areas originally contained extensive areas of scrub and open landscapes, while most, but not all, are now heavily forested (Chain Bridge Flats is still scrubby due to flood scour by the Potomac River on its low rocky shores). In 1910 the USDA Beltsville Agricultural Research Center was established in Prince George's County and in 1936 Patuxent Wildlife Research Center was established; both became centers for collecting.

Similar to early naturalists, recent collectors associated with the USGS Bee Lab sought out natural areas in the region. The Bee Lab was originally located at Patuxent Wildlife Research Center (now the USGS Eastern Ecological Research Center), was moved to the adjacent USDA Beltsville Agricultural Research Center, and then moved back to Patuxent. I collected extensively near my home along the Patuxent River near Upper Marlboro as well as at the Bee Lab locations. Collecting occurred regularly throughout Washington D.C. by myself and others but less commonly so in Montgomery County compared to the past. Traps were often used during the past 25 years (856 dates among the 3 subregions), much more so than netting (392 subregion/dates), surpassing even the historic netting efforts (582 subregion/dates). Evaluation of changes to populations will concentrate on comparisons of netting data, but trapping information as well as records from the literature will be used to gain perspective on the netting results.
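For anyone wanting to check the three totals quoted above against the table of sampling dates, a short sketch (the dictionary layout and function name are just one convenient, hypothetical way to hold the table):

```python
# Sampling dates by (subregion, source, method), copied from the table in the post.
SAMPLING_DATES = {
    ("DC", "GBIF", "netting"): 75,
    ("DC", "BIML", "netting"): 85,
    ("DC", "BIML", "trapping"): 196,
    ("MOCO", "GBIF", "netting"): 336,
    ("MOCO", "BIML", "netting"): 30,
    ("MOCO", "BIML", "trapping"): 75,
    ("PG", "GBIF", "netting"): 171,
    ("PG", "BIML", "netting"): 277,
    ("PG", "BIML", "trapping"): 585,
}

def total_dates(source, method):
    """Sum sampling dates across the three subregions for one source and method."""
    return sum(n for (_, s, m), n in SAMPLING_DATES.items() if s == source and m == method)

biml_trapping = total_dates("BIML", "trapping")  # 196 + 75 + 585 = 856
biml_netting = total_dates("BIML", "netting")    # 85 + 30 + 277 = 392
gbif_netting = total_dates("GBIF", "netting")    # 75 + 336 + 171 = 582
```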

(The next section will have results using a comparison of species lists)

Douglas Yanega

Sep 15, 2025, 2:05:06 PM

to EC...@ecnweb.org, beemon...@googlegroups.com

I have no faults to find with Sam's excellent analysis of the situation with assessment of species changes over time, though from very extensive personal experience, I feel it is worth highlighting two factors that are perhaps not emphasized enough.

On 9/15/25 6:48 AM, Sam Droege wrote:

> Experience: An experienced person will catch more bees and different species of bees than an inexperienced person; this is most obvious when comparing netting results. And. Even experienced people differ in how they approach the capture of bees and each collector will favor the capture of different species of bees and numbers of those species depending on their proclivities.
>
> Retention: This one is not often thought about, for sure. If you either avoid capturing certain species (do people really capture every honey and bumble bee they see?) in the field or pitch them after you have brought them back to your sorting table (do I really need another Augochlorella aurata in my collection?) or never identify them (e.g., I still don't identify most Lasioglossum males in the Dialictus group to species....) you impact the resulting "counts" that would be used to calculate change.

I've worked extensively with the personal collections of some of the most prolific bee collectors in the US, such as Cockerell, Timberlake, Michener, Rozen, and LaBerge. Their collections are all extremely biased, and biased in the same general ways.

The easiest way to explain it is this: (1) Each had a taxonomic bias, such that a disproportionately large amount of collecting focused on bees of a certain family or genus (or habit), independent of their actual or relative abundance. Timberlake, for example, collected almost exclusively at flowers, and his collection contains a surprisingly small number and diversity of cuckoo bees, which are much more commonly encountered at nest sites; Rozen's collection is the opposite, with cuckoo bees representing a disproportionately large percentage. (2) All of those historic collectors demonstrably ignored common and/or easily-recognized taxa and focused on rare or taxonomically difficult taxa. Timberlake collected over 200,000 bee specimens, of which only 5 were Apis mellifera (in contrast, the type series of many of Timberlake's species - many known ONLY from the type series - number in the dozens to hundreds, collected on a single plant on a single day, because he knew they were potentially new taxa). The more common and abundant a species is, the less likely an expert collector is to collect it, and the converse. I follow this rule myself, ignoring or discarding almost any bee I encounter that I can ID to species on sight and that I know is not rare or interesting. Overall, I estimate that I collect and retain only about 10% of the bees I encounter (that percentage has dropped over the decades of my career, too), and I suspect that the same was true for these other folks.

These sources of bias are profound, idiosyncratic, and impossible to adequately compensate for in any post-hoc analysis making use of historic records. Yet, they comprise a very significant proportion OF the historic records. Museum records are irreplaceable, but come with severe limitations that are not immediately obvious.

> Three sources of data were used.
>
> GBIF: This dataset represents specimens recorded in the Global Biodiversity Information Facility's (GBIF) database from the year 2000 and earlier.

This point, about the use of GBIF records, is the other source of concern that needs to be emphasized. GBIF harvests data from a variety of sources, and very few sources are screened for quality control. In certain cases, one can find data sources (often considered "reputable") for which nearly half of the data points are assigned georeferences that are wildly inaccurate, anywhere from 10 miles off to several hundred miles, because of over-reliance on automated georeferencing tools and a lack of any protocol for confirming results. Certain institutions are much more prone to these errors, because of the implementation of georeferencing protocols with little or no oversight, resulting in records like this one: https://www.gbif.org/occurrence/657731871 which is almost 700 miles away from the locality specifically indicated on the label. In every major US institution (literally across all institutions, personally confirmed), 10-20% of all pre-2000 records also have original specimen labels that are inaccurate or (more often) data-deficient, which likewise leads to mapping errors (note that bad original labels also evade detection when crowdsourcing is used as a QC step). These bad records appear in GBIF, and can lead to dramatically erroneous conclusions regarding the distributions of certain species; this is especially bad when the species are of potential conservation concern (as with errors in 24 of the 49 records for Bombus caliginosus from the KSEM collection, one of which is given above), and agencies like the USFWS use bad data to build their maps.

The use of captured data for museum specimens is only useful if the data are accurate, and very few institutions double-check their captured data for accuracy. This is an independent source of error from misidentifications, which can also create significant problems.
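As a rough illustration of what double-checking a georeference could involve, the sketch below computes the great-circle (haversine) distance between a record's assigned coordinates and a reference coordinate for the locality on the label, and flags the record if the gap exceeds a threshold. The reference coordinate is assumed to come from manual verification, the 16 km threshold (about 10 miles) and the example coordinates are hypothetical, and none of this is a protocol used by GBIF or any institution named above.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_suspect(georef, reference, max_km=16.0):
    """Flag a record whose assigned coordinates are far from the verified locality."""
    return haversine_km(*georef, *reference) > max_km

# Hypothetical example: assigned point is one degree of latitude north of the locality.
label_locality = (38.9, -77.0)   # placeholder, manually verified coordinate
assigned_point = (39.9, -77.0)   # placeholder, coordinate stored in the database
offset_km = haversine_km(*assigned_point, *label_locality)
flagged = is_suspect(assigned_point, label_locality)
```

A check like this only catches database coordinates that disagree with a trusted locality; it does nothing for labels that were wrong or data-deficient to begin with.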

Peace,

-- 
Doug Yanega      Dept. of Entomology       Entomology Research Museum
Univ. of California, Riverside, CA 92521-0314  voicemail:951-827-8704
FaceBook: Doug Yanega (disclaimer: opinions are mine, not UCR's)
             https://faculty.ucr.edu/~heraty/yanega.html
  "There are some enterprises in which a careful disorderliness
        is the true method" - Herman Melville, Moby Dick, Chap. 82
