 

The Benzene Case Study:

Using Right-to-Know Data to Answer Important Environmental Policy Questions

paper prepared for the

ESP Case Study #2 May, 1999 by

Dr. James R. Lee
Ms. Anna Jung

Environment, Statistics and Policy (ESP) Project

American University


Executive Summary

This report presents a case study carried out by a (fictitious) organization that represents public environmental interests, in particular concern over the dangers of the chemical benzene. Its subject is benzene dumping in the Baltimore area. The BENZENE case study is predicated on the need to estimate environmental clean-up costs based on emissions aggregates for benzene-like compounds.

The study leads to four recommendations:

1. Tell the user more about the technical aspects of using the data.

2. Make the data easier to use.

3. Create and make available time series data.

4. Provide a more useful context for the data.


 

I. The Problem

Public health and environmental advocates both call for the clean-up of harmful chemicals such as benzene. This is a problem in Baltimore, where several industries produce benzene as a by-product. Not only is benzene a threat to human health, it is also a danger to the health of the Chesapeake Bay, since much of the benzene is washed or drained into the Bay, where it lies in deposits at the bottom.

Prior to even considering policy mechanisms to solve problems of benzene pollution, there needs to be an estimate of how much benzene has actually been released into the environment.

The purpose of this case study is to determine the extent to which the publicly available data released on the Web can be used to estimate magnitudes or aggregates of pollutants released into the environment. We want to know how much benzene has been dumped in the city of Baltimore as far back in time as possible.

 

II. Research Approach

To estimate the level of benzene emissions, the RTK NET site was used to search for benzene releases. Since the Toxics Release Inventory (TRI) is the oldest of the datasets available, it gives the longest time line (1987-1996).

The TRI data on this site can be sorted by a specific chemical, such as benzene, but this is not the case with most other RTK data sets. The TRI is also a more comprehensive database, since most others are limited to single-media reporting.

This search was run against the copy of EPA's TRI database held by RTK NET (the Right-To-Know Network). RTK NET is run by OMB Watch and Unison Institute at 1742 Connecticut Ave., NW, Washington DC, 20009 - Phone: 202-234-8494. The search was done on 02/09/1999. Here are the input criteria (see Table 1).

Table 1
AREA REPORT (TRI DATA)
Search used: range
Zip Code: ALL
City: BALTIMORE
County: ALL
State: MD
Chemical: BENZ* [BENZENE, BENZOYL PEROXIDE]
CAS: ALL
Year: ALL
Level of Detail: HIGH
Output Type: Text
Sort Order: Facility name
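
The effect of the criteria in Table 1 can be sketched in a few lines of Python. This is only an illustration of how a wildcard such as BENZ* behaves, not a description of how RTK NET implements its search. The records, field names, and facility names below are hypothetical and are not taken from the TRI data.

```python
from fnmatch import fnmatchcase

# Hypothetical records for illustration only; not taken from the TRI data.
records = [
    {"facility": "Example Plant A", "city": "BALTIMORE", "state": "MD", "chemical": "BENZENE"},
    {"facility": "Example Plant B", "city": "BALTIMORE", "state": "MD", "chemical": "BENZOYL PEROXIDE"},
    {"facility": "Example Plant C", "city": "BALTIMORE", "state": "MD", "chemical": "TOLUENE"},
    {"facility": "Example Plant D", "city": "ANNAPOLIS", "state": "MD", "chemical": "BENZENE"},
]


def matches(record, city, state, chemical_pattern):
    """Return True if the record meets the city, state, and chemical criteria."""
    return (
        record["city"] == city
        and record["state"] == state
        and fnmatchcase(record["chemical"], chemical_pattern)
    )


# Apply the Table 1 criteria and sort by facility name, as in the search.
selected = sorted(
    (r for r in records if matches(r, "BALTIMORE", "MD", "BENZ*")),
    key=lambda r: r["facility"],
)

for r in selected:
    print(r["facility"], "-", r["chemical"])
```

Note that the wildcard also selects benzoyl peroxide, which is why the aggregates in this study cover benzene-like compounds rather than benzene alone.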

III. Results

This section discusses right-to-know data, where to find it, and how to get it.

A. Using Right-to-Know Data

We accessed environmental data available to the public through "Right-to-Know" legislation. RTK.NET is the major private "Right-to-Know" Web site, run by a private organization affiliated with OMB Watch and the Unison Institute. RTK.NET uses information from EPA, so RTK data appears to cover about the same categories of reporting as Envirofacts.

Another level of sorting options must be made available to the user to make the system more usable. Further, many user choices could be expanded and other "user-friendly" features added. Here is a brief description of the process of getting data.

The RTK.NET web site is very accessible, and the options for sorting by geographic locale are quite easy. Actually managing and utilizing the data is another matter. The system does have a nice feature which sends the data for the chosen geographic locale to your email address.

The data we received is an ASCII file, which you can easily download from Netscape email using the "save as" option. The file contains more than 100 categories and can be transmitted in either TAB or COMMA delimited format. From there one can import the files into Quattro Pro or Excel (and presumably SPSS and SAS) for data analysis.
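
For readers who prefer a script to a spreadsheet, the import step might look something like the following Python sketch. It assumes only that the file is plain text with one record per line and fields separated by tabs or commas. The sample lines are hypothetical, and the rule for guessing the delimiter is a crude heuristic of our own.

```python
import csv
import io


def read_delimited(text, delimiter=None):
    """Parse delimited ASCII text into a list of rows (lists of strings).

    If no delimiter is given, guess from the first line: a tab if one is
    present, otherwise a comma. This is a crude heuristic, not a standard.
    """
    if delimiter is None:
        first_line = text.splitlines()[0] if text else ""
        delimiter = "\t" if "\t" in first_line else ","
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Hypothetical sample lines standing in for the e-mailed file.
tab_sample = "EXAMPLE FACILITY\tBALTIMORE\tMD\t1987\t100\n"
comma_sample = "EXAMPLE FACILITY,BALTIMORE,MD,1987,100\n"

tab_rows = read_delimited(tab_sample)
comma_rows = read_delimited(comma_sample)

# Both formats should yield the same fields once parsed.
print(tab_rows == comma_rows)
```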

B. What Does the Data Tell Us?

Most of the benzene releases occurred during the earlier part of the time period, and most of these releases were to the air (fugitive and stack). Therefore, the benzene compounds may have dispersed over a large area. However, most of that area would still be within the watershed, so the benzene emissions might eventually wind up in the Chesapeake Bay.


 

Table 2
Data in pounds

Year    Releases   Transfers   Total Production
1987      54,000      11,800             65,800
1988      90,000       7,800             97,000
1989      80,500       8,800             93,300
1990      72,590      11,010             83,600
1991      68,794       7,799             76,593
1992      39,877       8,417             48,294
1993       4,388      65,595             69,983
1994       3,655      91,605             95,260
1995       1,947     106,830            108,777
1996       2,323      28,920             31,243
Totals   418,074     348,576            766,650

 


Between 1987 and 1996, over 400,000 pounds of benzene-related compounds were released from facilities within the city of Baltimore. These were mostly releases into the atmosphere. The vast majority of the transfers went out of state, with large amounts going to states such as New York, Kentucky, Pennsylvania, and South Carolina.
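
The figures in Table 2 can be cross-checked with a short script. The Python sketch below simply re-enters the values as printed and tests whether each row and column adds up; the variable names are our own. Run against the printed figures, it confirms the three column totals but flags the 1988 and 1989 rows, where releases plus transfers do not equal the printed total. The table alone does not show which entry in those rows is in error.

```python
# Table 2 as printed: year -> (releases, transfers, printed row total), in pounds.
table2 = {
    1987: (54000, 11800, 65800),
    1988: (90000, 7800, 97000),
    1989: (80500, 8800, 93300),
    1990: (72590, 11010, 83600),
    1991: (68794, 7799, 76593),
    1992: (39877, 8417, 48294),
    1993: (4388, 65595, 69983),
    1994: (3655, 91605, 95260),
    1995: (1947, 106830, 108777),
    1996: (2323, 28920, 31243),
}
printed_totals = (418074, 348576, 766650)

# Column sums recomputed from the yearly rows.
release_sum = sum(releases for releases, _, _ in table2.values())
transfer_sum = sum(transfers for _, transfers, _ in table2.values())

# Rows where releases + transfers differs from the printed row total:
# year -> (computed total, printed total).
mismatches = {
    year: (releases + transfers, total)
    for year, (releases, transfers, total) in table2.items()
    if releases + transfers != total
}

# Share of all releases that occurred in the first five years (1987-1991).
early_share = sum(table2[year][0] for year in range(1987, 1992)) / release_sum

print("Column totals match:", (release_sum, transfer_sum, release_sum + transfer_sum) == printed_totals)
print("Rows that do not add up:", mismatches)
print("Share of releases in 1987-1991: {:.1%}".format(early_share))
```

The last figure puts a number on the observation that most releases occurred early in the period.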

C. Comparable and Additional EPA Data

Through examination of comparable and additional data, it is possible to provide context for the BENZENE case.

1. Comparable Data on the Web: Envirofacts

We attempted to access the same data set using EPA's Envirofacts (EF) Web system, as an initial effort to compare data sets on water discharges.

2. Additional Data on the Web

a. EPA's Surf Your Watershed

Data on water discharges is only part of the overall story in explaining other factors present in the pfiesteria outbreak on the Pocomoke River. In probing deeper into these other issues, we would also need to obtain data from Storet(x) which details the volume of water discharges and therefore provides context for the PCS discharge data. Watershed data (Surf Your Watershed) may also provide critical background information.

b. FedStat

We also used FedStat, which is a collection of databases from many Federal agencies, to seek out benzene dumping elsewhere.

IV. Recommendations

Here are four recommendations about improving public access to and use of right-to-know environmental data.

1. Tell the user more about the technical aspects of using the data.

There is simply not enough information available on the process of downloading and utilizing data from either Web site. We use Quattro Pro on the AU system. The default on the RTK.NET system is tab-delimited format, although Quattro Pro supports a comma-delimited format. We unfortunately discovered this the hard way. There should be an explanation of how to actually manage the data in various software packages, as well as introductory instruction in analyzing it. At the user end, there should be a user-friendly choice of downloading the data in readily accessible formats (for example, Quattro Pro, Excel, Word, etc.).
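
Until such a choice is offered, a user who receives the tab-delimited default can convert it locally. The following Python sketch shows one way to do so, under the assumption that the file is ordinary tab-separated text. The function name and the sample line are hypothetical.

```python
import csv
import io


def tab_to_comma(tab_text):
    """Convert tab-delimited text to comma-delimited text.

    The csv module is used instead of a plain find-and-replace so that
    fields which already contain a comma are quoted in the output.
    """
    reader = csv.reader(io.StringIO(tab_text), delimiter="\t")
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    writer.writerows(reader)
    return output.getvalue()


# Hypothetical record whose first field contains a comma.
sample = "A, Inc.\tBALTIMORE\t1987\n"
print(tab_to_comma(sample))
```

A plain find-and-replace would break a field such as the facility name above into two fields, which is exactly the kind of silent error an inexperienced user would not notice.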

2. Make the data easier to use.

The data is presented in no apparent order, which confuses the user as to the order of information types. For example, the data fields in the downloaded files are not accompanied by the data headers when imported, which means these must be imported from another file or typed in by hand. The e-mailed data set includes a hyperlink to the header categories, but this step is an additional obstacle for the user to solve as well as another potential source of error in data use.

3. There needs to be readily usable time series data available.
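
The header problem can be worked around by pairing the field names with each data row in a script, as in the Python sketch below. The header names and the sample row are hypothetical; the real file has more than 100 fields whose names would come from the separate header listing. The length check is there to catch the kind of misalignment that typing headers by hand invites.

```python
import csv
import io


def attach_headers(header_names, data_text, delimiter="\t"):
    """Pair each data row with the header names, returning a list of dicts.

    Raises ValueError if a row has a different number of fields than there
    are headers, since a silent mismatch would mislabel the data.
    """
    records = []
    rows = csv.reader(io.StringIO(data_text), delimiter=delimiter)
    for line_number, row in enumerate(rows, start=1):
        if len(row) != len(header_names):
            raise ValueError(
                "line {}: expected {} fields, found {}".format(
                    line_number, len(header_names), len(row)
                )
            )
        records.append(dict(zip(header_names, row)))
    return records


# Hypothetical header names and data, for illustration only.
headers = ["facility", "city", "year", "pounds"]
data = "EXAMPLE FACILITY\tBALTIMORE\t1987\t100\n"

labeled = attach_headers(headers, data)
print(labeled[0]["year"], labeled[0]["pounds"])
```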

There should also be a means by which to discriminate data by time, as that is a feature which will be of constant concern. Data will naturally need to be examined in terms of periodicity. This information is determinable, but is not easily attainable in the current data offering on the Web sites.
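
In the meantime, a user can build a yearly series by summing the facility-level records for each reporting year. The Python sketch below shows the idea, assuming each record carries a year and a quantity in pounds. The field names and the figures are hypothetical and are not drawn from the TRI data.

```python
from collections import defaultdict


def yearly_totals(records):
    """Sum the reported pounds for each year and return them in year order."""
    totals = defaultdict(int)
    for record in records:
        totals[int(record["year"])] += int(record["pounds"])
    return dict(sorted(totals.items()))


# Hypothetical facility-level records, for illustration only.
records = [
    {"facility": "Example Plant A", "year": "1988", "pounds": "300"},
    {"facility": "Example Plant B", "year": "1987", "pounds": "100"},
    {"facility": "Example Plant A", "year": "1987", "pounds": "250"},
]

series = yearly_totals(records)
for year, pounds in series.items():
    print(year, pounds)
```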

4. Provide a more useful context for the data.

There is context for the data, but it is often at levels too disparate from the level of data. In the Pfiesteria case study, there was a context, but the specific locations of the point source data could not link up to the eco-system level data of the context. There must be some discrimination in eco-system levels and scopes to provide a link to the point-source data.

Appendix 1

PCS Data Field Explanations

Next Steps


We think one way to explore this case study is to follow up on this trail of discovery by turning attention away from somewhat sophisticated use by researchers to the problems of providing accessible data that can be used. Therefore, we suggest the case study continue, but this time with a focus on the Pfiesteria case study within EPA itself. The case study could serve as a simulation to test the legal limits of the information.

We propose a project that will both educate and examine "right-to-know" (RTK) consumer data that is now available on the Web. The Educating and Evaluating RTK project (EE-RTK) would use students in assessing and using right-to-know data. Not only would it provide valuable feedback on the use and misuse of the data, it can also serve as a basis for developing the elements of a class built around this subject.

Appendix 3

Nine Database Quality Questions


This case study constitutes a good basis from which to answer the "Nine Database Quality Questions" which form the basis for review of PCS and other EPA databases. Our approach to the subject is as scientists. We believe that the level of accuracy requires an assumption of proof, both for attaining reasonable scientific findings and for the legal reasons that flow from scientific findings, especially those based on statistics. We answer these questions from the standpoint of an academic researcher using the data, one whose findings would therefore need to be sufficient to stand as expert testimony in a court case or as proof of a statistical relationship. We also assume that the data is publicly available, and we began with a non-profit user of EPA data.

1. How Comprehensive is the Database?

Unknown. As a case study, comprehensiveness was antithetical to the scope of the research.

2. Can the Database Be Used for Spatial Analysis?

Maybe. There are spatial variables in the database. However, it is unknown whether they are geographically exact enough to establish cause and effect. Is the address in the report the site of an event, the site of the nearest post office, or the corporate headquarters filing the report? Likewise, do municipal variables refer to the location of the event or of the government office responding to the request? Furthermore, there are distinct state-by-state reporting characteristics that were found in this case that need to be addressed.

3. Can the Database be Used for Temporal Analysis?

No. Publicly available PCS contains inadequate data for even constructing a time series, a key finding of our report. This data does exist, but the data on the Web, and thus publicly available, has only limited time series indications. This is a function of both funding and protection of business and privacy interests.

4. How Consistent Are the Variables Over Space and Time?

Not enough. Time is distorted and space may be limited in the dataset.

5. Can Data Be Linked with Information from other Databases?

Absolutely. We were able to use PCS along with other data through the facility reports provided, although these were not publicly available data sets.

6. How Accurate are the Data?

We did not investigate this.

7. What are the Limitations?

Is the data that is now available on the Internet of sufficient quality for scientific examination? At the moment, the answer is no.

8. How Can I Get Information?

Any Internet account with a search engine can find the data. We did not investigate ordering the data by phone in hard copy.

9. Is There Documentation?

Yes, but not very accessible.


August, 1998


This site was conceived of by Dr. James R. Lee, jlee@american.edu
American University, The School of International Service
4400 Massachusetts Ave. NW. Washington, DC 20016-8071