DECIDING WHEN TO INTERVENE

Data Interpretation Tools for Making Sediment Management Decisions Beyond Source Control

Based on a Workshop to Evaluate Data Interpretation Tools used to Make Sediment Management Decisions held at the Great Lakes Institute for Environmental Research at the University of Windsor on December 1-2, 1998

Prepared by: Gail Krantzberg, John Hartig, Lisa Maynard, Kelly Burch, and Carol Ancheta
Sediment Priority Action Committee
Great Lakes Water Quality Board

1999


APPENDIX 14

CONTAMINATED SEDIMENT: WHEN IS CLEANUP REQUIRED?
THE WASHINGTON STATE APPROACH

Teresa Michelsen
Avocet Consulting
15907 76th Place NE
Kenmore, Washington 98028
(425) 487-6277
avocet@halcyon.com

Introduction

Washington State was the first jurisdiction in North America to adopt sediment quality criteria, including narrative standards, numeric biological effects criteria, and numeric chemical criteria. In addition, a decision framework was developed to use these criteria in deciding when to list a contaminated site, when cleanup is required, when source control is required to protect sediment, and when dredged material is unsuitable for open-water disposal. Unlike most other sediment quality criteria currently used in State and Provincial programs, these criteria are not used as screening levels, but as actual cleanup standards.

While the numeric standards originally applied only to benthic toxicity in marine sediment, the Department of Ecology (Ecology) is currently developing freshwater criteria and human health criteria, which will be incorporated into the next round of rule revisions (late 1999-early 2000). Although these numeric criteria have not yet been finalized, the decision framework is the same as for marine sediment, and is equally applicable to all environments. This framework is described below, and can be used with or without promulgated numeric criteria.

Protected endpoints

It is important to recognize that contaminated sediment has three potential pathways of concern, each of which must be considered in conducting a site investigation and selecting cleanup standards:

Existing sediment quality guidelines and interpretive frameworks (e.g., sediment quality triad) often address only benthic toxicity, whereas risk assessment approaches often focus on food web models and bioaccumulation issues. Any complete method for determining when a contaminated site requires cleanup must include consideration of all three pathways and tools that address these pathways. These three endpoints should not be played off against each other in a preponderance-of-evidence approach: each is a protected endpoint in and of itself, and exceedance of any one guideline should trigger action.

Tiered decision framework

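A tiered approach to decision-making is the heart of the Washington State approach to determining when a site requires cleanup. In theory, for each pathway there would be three types of criteria:

- Narrative standards
- Effects-based criteria that can be measured in the field
- Numeric chemical criteria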

The bullets above are listed in order of development; that is, an agency normally develops the narrative standard first, then translates that narrative standard into more specific effects-based criteria that can be measured in the field. Third, once enough chemical and biological data have been collected, a numeric criterion can be calculated that corresponds to the effects-based and narrative standards. For any given pathway, the agency may be in different stages of criteria development. If the more specific numeric standards have not yet been calculated, either the effects-based criteria or the narrative standards can be used to guide site-specific approaches to cleanup determinations.

At any given site, a three-tiered approach, described below, can be used. Lower tiers cost less in terms of time and resources, but may be less accurate in terms of site-specific effects than higher tiers. Any of the tiers can be used to make cleanup decisions. The decision to proceed to a higher tier may be made by either the responsible party or the agency.

Tier 1 - Numeric chemical criteria. Once the numeric criteria are calculated, they can be used as a shortcut at smaller or less controversial sites, to save money, time, and resources. If the responsible party and the agency agree, the chemical criteria can be used directly to delineate site boundaries and set cleanup standards. For this approach to work well, the chemical criteria must be relatively accurate in predicting biological effects, rather than weighted toward the conservative side (e.g., Ontario screening levels). In other words, equal consideration must be given to false positives and false negatives, and the chemical criteria must be calculated to have a high overall accuracy rate in predicting actual effects in the field. This is one reason that Apparent Effects Thresholds (AETs) appear less conservative than approaches such as threshold effects levels and probable effects levels (TELs/PELs): AETs are designed to be used as actual cleanup standards, not as screening levels.
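The idea of weighing false positives and false negatives equally can be illustrated with a small calculation. The following is a minimal sketch, not Ecology's actual procedure: the `Station` record and the `reliability` function are hypothetical names, and the data would come from stations where chemistry and biological effects were measured together.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Station:
    """One station with paired chemistry and biology (hypothetical record)."""
    exceeds_chemical_criterion: bool  # Tier 1 prediction: chemistry above the criterion
    observed_effect: bool             # adverse biological effect measured in the field


def reliability(stations: Iterable[Station]) -> dict:
    """Summarize how well a chemical criterion predicts observed effects."""
    stations = list(stations)
    # Count the four possible outcomes of prediction versus observation.
    tp = sum(s.exceeds_chemical_criterion and s.observed_effect for s in stations)
    tn = sum(not s.exceeds_chemical_criterion and not s.observed_effect for s in stations)
    fp = sum(s.exceeds_chemical_criterion and not s.observed_effect for s in stations)
    fn = sum(not s.exceeds_chemical_criterion and s.observed_effect for s in stations)
    total = len(stations)
    return {
        # Share of unaffected stations wrongly flagged by chemistry.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        # Share of affected stations missed by chemistry.
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
        # Share of all stations predicted correctly.
        "overall_accuracy": (tp + tn) / total if total else 0.0,
    }
```

A criterion set conservatively, as a screening level, would tend to show a low false negative rate at the cost of a high false positive rate. A criterion intended for direct use as a cleanup standard would aim to keep both rates low, which is what a high overall accuracy reflects.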

Tier 2 - Effects-based criteria. At any site, either the responsible party or the agency can request to conduct field measurements of biological effects in lieu of using chemical criteria. The results of these tests are then compared against the numeric effects-based criteria as in the second bullet above. These results always override the chemical criteria, because they are considered more direct measurements of adverse effects. This is true regardless of whether the chemical criteria were passed or failed.

Tier 3 - Site-specific risk assessment. If there are no effects-based criteria yet developed that are representative of the types of pathways or effects seen at the site, then the narrative standards are used to guide a site-specific ecological or human health risk assessment that addresses that specific pathway of concern.
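The override relationship among the tiers can be sketched in a few lines. This is an illustration only, assuming each tier's outcome has been reduced to a yes/no answer on whether cleanup is indicated; the function name `cleanup_required` and the use of `None` for a tier that was not evaluated are assumptions made for the example.

```python
from typing import Optional


def cleanup_required(
    tier1: Optional[bool],
    tier2: Optional[bool],
    tier3: Optional[bool],
) -> Optional[bool]:
    """Return the result of the highest tier that was evaluated.

    Each argument is True (cleanup indicated), False (not indicated),
    or None (tier not evaluated at this site).
    """
    # Check from the highest tier down, so a higher-tier result
    # overrides a lower-tier result whether the lower tier passed or failed.
    for result in (tier3, tier2, tier1):
        if result is not None:
            return result
    return None  # no tier has been evaluated yet
```

For example, `cleanup_required(True, False, None)` returns `False`: the chemical criteria were exceeded, but the biological tests showed no exceedance, and the biological result governs.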

The following sections describe the specific approach to making cleanup decisions for benthic effects and bioaccumulative risks used in Washington, under each of the three tiers.

Benthic effects

Tiers 1 and 2 are available for benthic effects in marine sediment - both numeric chemical and biological standards exist. Tier 3, site-specific risk assessment, is seldom or never used for benthic effects because adverse effects can be directly measured and compared against the numeric criteria; there is no need for modeling or probabilistic approaches.

Under Tier 1, AETs are used as chemical criteria. At least 4 AETs are calculated for each chemical, each of which represents a different species or biological test. AETs currently promulgated include the amphipod Rhepoxynius abronius acute bioassay, oyster larvae survival and abnormal development test, Microtox, and benthic effects. AETs have also been recently calculated for the echinoderm Dendraster excentricus larval bioassay, and the Neanthes arenaceodentata growth test. The lowest of the AETs is used as the long-term goal for sediment quality in the State, and the second-lowest AET is used as an upper limit for cleanup. A site-specific cleanup level is selected as close as possible to the long-term goal, but no higher than the second-lowest AET. This gives site managers some flexibility to address site-specific conditions of cost, feasibility, and net environmental benefit.
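The selection of the long-term goal and the upper limit from a set of AETs can be sketched as follows. This is a minimal illustration with hypothetical function names; any AET values passed in are placeholders, not promulgated criteria.

```python
from typing import Dict, Tuple


def cleanup_range(aets: Dict[str, float]) -> Tuple[float, float]:
    """Return (long_term_goal, upper_limit) for one chemical.

    `aets` maps a biological test name to its AET concentration.
    The lowest AET is the long-term goal; the second-lowest AET is
    the upper limit for a site-specific cleanup level.
    """
    if len(aets) < 4:
        # The text states that at least four AETs are calculated per chemical.
        raise ValueError("at least four AETs are needed for each chemical")
    ordered = sorted(aets.values())
    return ordered[0], ordered[1]


def within_upper_limit(cleanup_level: float, aets: Dict[str, float]) -> bool:
    """Check that a proposed cleanup level is no higher than the second-lowest AET."""
    _, upper_limit = cleanup_range(aets)
    return cleanup_level <= upper_limit
```

Where in the range a site-specific cleanup level falls is a site management judgment about cost, feasibility, and net environmental benefit, which this sketch does not attempt to model.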

As an alternative to using chemical standards, Tier 2 biological effects levels may be used. Under Tier 2, a responsible party must conduct a suite of two acute and one chronic biological tests from an approved list of bioassays and benthic community studies, and compare the results of these tests to the promulgated biological criteria. For each approved test, Ecology has defined two levels of impact. The lower level of impact typically corresponds to the minimum detectable difference in comparison to a reference station. A higher level of impact might be 30-50% adverse effects such as mortality, reduction in growth, or abnormal development. The results of the bioassays are scored against these levels of impact for each station tested. If two tests show a low-level impact, or one test shows a high-level impact, then that station is considered to exceed the cleanup level. All the stations showing impacts are mapped, and a cleanup boundary selected that includes the impacted stations.
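The station scoring rule described above can be sketched as follows. This is an illustration under stated assumptions: each test result has already been classified as no impact, low-level impact, or high-level impact against the promulgated criteria, and the names `Impact`, `station_exceeds`, and `impacted_stations` are hypothetical.

```python
from enum import Enum
from typing import Dict, List, Sequence


class Impact(Enum):
    NONE = 0
    LOW = 1
    HIGH = 2


def station_exceeds(results: Sequence[Impact]) -> bool:
    """Apply the rule: two low-level impacts, or one high-level impact."""
    high = sum(r is Impact.HIGH for r in results)
    low = sum(r is Impact.LOW for r in results)
    return high >= 1 or low >= 2


def impacted_stations(station_results: Dict[str, Sequence[Impact]]) -> List[str]:
    """List the stations that exceed the cleanup level, for mapping."""
    return [name for name, results in station_results.items()
            if station_exceeds(results)]
```

The returned list corresponds to the stations that would be mapped; drawing the cleanup boundary around them remains a spatial judgment outside the scope of this sketch.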

For freshwater sites, numeric chemical and biological effects standards are not yet promulgated. Draft biological effects standards for two freshwater bioassays have been developed and included in the Dredged Material Evaluation Framework for the Lower Columbia River Management Area. Ecology's regulatory workgroup is considering these and additional biological standards for inclusion in the next round of rule revisions. The draft biological criteria are presented below. In the meantime, site managers are selecting appropriate freshwater bioassays and determining site-specific biological effects criteria for comparison to field data, using criteria and decision frameworks analogous to the marine criteria.


Bioassay                      Low-Level Effect               High-Level Effect
----------------------------  -----------------------------  -----------------------------
Amphipod Hyalella azteca      Mortality greater than         Mortality 15% higher than
(10-day mortality test)       reference station (a)          reference mortality

Midge Chironomus tentans      Mortality greater than         Mortality 20% higher than
(10-day mortality test and    reference station (a) and      reference mortality and
10-day growth test)           biomass less than reference    biomass 60% of reference
                              station                        biomass (a)

(a) Difference must be statistically significant (alpha = 0.05).

Draft freshwater AETs have also been calculated for Hyalella azteca and Microtox, but there are not yet enough data to calculate AETs for other tests. Because at least 4 AETs are needed to promulgate numeric chemical cleanup standards, more data will be needed before chemical criteria can be published.

Bioaccumulative effects

Bioaccumulative effects are of concern for both people and fish/wildlife. Because these criteria are calculated in essentially the same way, both are treated together here. All three tiers are under development for the next round of rule revision in Washington. Tier 1 would consist of specific sediment quality criteria that were developed using bioaccumulation models back-calculated to sediment. These are derived in the following manner:

Draft Tier 1 sediment quality criteria have been developed by Ecology and are under consideration for promulgation in the next round of rule revision. Similar to benthic effects criteria, a range of acceptable sediment quality levels will probably be derived (for example, based on a range of 1 x 10^-6 to 1 x 10^-5 carcinogenic risks in humans) that would give site managers some flexibility in selecting cleanup standards at a site.

Because there is still a great deal of uncertainty in the biota-sediment accumulation factor (BSAF) portion of the model, Tier 2 will likely consist of using the TTLs directly as effects-based criteria. A responsible party could collect fish and shellfish from the site, or conduct laboratory or in situ bioaccumulation tests, and compare the measured tissue levels with the TTLs directly. If the TTLs are exceeded, the results of these tests could also be used along with surface sediment data to derive a site-specific BSAF, which could then be used to back-calculate site-specific sediment quality criteria.
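The back-calculation described above can be sketched numerically. The text does not specify how the BSAF is defined; the sketch below assumes the commonly used form in which tissue concentration is normalized to lipid content and sediment concentration to organic carbon content. The function names and all input values are hypothetical and are for illustration only.

```python
def site_specific_bsaf(
    tissue_conc: float,      # measured tissue concentration (e.g., mg/kg)
    lipid_fraction: float,   # lipid fraction of the tissue (0-1)
    sediment_conc: float,    # surface sediment concentration (same units)
    toc_fraction: float,     # organic carbon fraction of the sediment (0-1)
) -> float:
    """Assumed BSAF form: lipid-normalized tissue over carbon-normalized sediment."""
    return (tissue_conc / lipid_fraction) / (sediment_conc / toc_fraction)


def back_calculated_sediment_criterion(
    ttl: float,              # tissue level from the TTLs (same units as tissue_conc)
    lipid_fraction: float,
    toc_fraction: float,
    bsaf: float,
) -> float:
    """Sediment concentration at which predicted tissue level equals the TTL.

    Rearranges the assumed BSAF definition to solve for sediment concentration.
    """
    return (ttl / lipid_fraction) * toc_fraction / bsaf


# Hypothetical site data: tissue 2.0, lipid 5%, sediment 1.0, organic carbon 2%.
bsaf = site_specific_bsaf(2.0, 0.05, 1.0, 0.02)
criterion = back_calculated_sediment_criterion(1.0, 0.05, 0.02, bsaf)
```

Because the assumed relationship is linear, halving the allowable tissue level halves the back-calculated sediment concentration. A site where the relationship between sediment and tissue is not linear, or where exposure is not dominated by local sediment, would call for the Tier 3 approach instead.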

Tier 3 would consist of site-specific food web modeling for ecological risks, or use of site-specific human health exposure scenarios, if unusual receptors or exposure pathways existed at the site.

Summary

In summary, in Washington State, cleanup decisions are made on the basis of both benthic toxicity and bioaccumulation pathways. For each pathway, several tiers are available for determining whether the level of risk warrants cleanup, ranging from numeric sediment cleanup criteria, to biological effects-based criteria, to site-specific risk assessments. The results of higher tiers, being more resource-intensive and more site-specific, always override the results of lower tiers. The decision to proceed to a higher tier may be made by either the responsible party or the agency, and may depend on the size and complexity of the site, the potential for unusual bioavailability or exposure issues, and the resources at risk at the site. Higher tiers are always used if chemicals or exposure pathways are present that are not represented by the numeric chemical or biological criteria available for lower tiers.