Assistant Professor Ericka Menchen-Trevino's research and teaching interests lie at the intersection of political communication and digital media studies, with a focus on methodology. Over the course of her career, she has worked as a research lab manager, an ethnographic research consultant, a grant writer, a technical writer, a web designer, and a technical support representative. She is working to advance research methodology in the digital era, particularly mixed methods approaches, both conceptually and by developing the research software Web Historian. We talked with her about Web Historian and how it’s useful for researchers and anyone who uses the internet.
Why did you decide to focus Web Historian on overall web usage and not just social media?
As you know, when you are on social media it directs you to other websites. Web Historian can capture that social media use; it is not excluding it, just putting it into a bigger context. The history menu on your browser has all the websites you’ve visited, and that’s the data I’m using.
How long did it take to develop Web Historian?
I had to teach myself JavaScript. Doing it today would take less time, because back then I had to learn the language it is written in. I came up with the initial idea two-and-a-half to three years ago, and it existed in prototype form a little over a year ago. The current form was developed this past summer. I started off with a proof of concept and then built it out with all the features from there.
Prior to that I made a tool called “Roxy,” for research proxy, which was the tool for my dissertation. In 2010 many sites and webpages still used plain HTTP rather than secure protocols, which is not the case now. That system relied on that, and as the web changed the tool became increasingly less useful, so there was a need for something else. I thought about what other ways I could partner with users to gather digital traces from the web.
Trace partnership is the motivation behind Web Historian and other tools I am developing. Researchers need to partner with users to understand digital traces. Solely relying on companies to gather data means that often there is no real consent between the two parties. Most people don’t read the user agreements or fully understand them, and people don’t know how companies are going to use that data in the future. With a sufficient amount of data, anyone could have a program look at public tweets, see what mood you are in, and sell that data to someone who may want to employ you.
The way data is used isn’t understood fully even if you do read the agreement. But people don’t want to know this. My intuition is that others need to know that *not* knowing doesn’t stop it from happening, which is the education component of the project.
Your browsing history has a corresponding record that is all over the web. This is very personal information that you control in a certain way. Even if you delete it, it’s still all over the web. It is important that we know the data and what it says about us, even if there are systems present that we might not like but need to be aware of.
What can individuals gain from the project?
People become more aware of the right to their own data and understand what they have. Researchers are put on the side of users with informed consent. Instead of gathering mass information, researchers partner with individuals by comparing that information with anonymous data. Individuals get to choose to be an informed partner in research.
Once a participant is knowledgeable about their traces, they can participate in a research project about the data. An individual can then take a survey or be part of an interview, which researchers can combine with traces to add insight to an existing method.
What are your future plans for this project and how is it influencing the development of other projects that are data driven?
There need to be more tools that allow people to have access to their own web usage, but then we have to go beyond that. Although it is a difficult problem, I want to continue doing the work. Corporations and governments have access to data that social scientists could only dream of. They know the behavior that we would like to know about.
The Right to Be Forgotten is a European law that removes results from Google’s index, not from the web itself. Most people are gated by what they see: through what lenses are you able to view content online?
Everyday use of Web Historian would reveal how someone is using their time online and which websites control access to other websites. The network visualization allows users to see the central points from which they access other sites, i.e. search engines and social media: how you get to websites. This is important in many ways because your online world is controlled by what index is used. For example, if you use Google a lot, it will show in the visualization, and it could mean you need to branch out and use other ways to find information.
Right now there are four types of visuals; they use less data but are more impactful. The word cloud gives people a good intuitive sense of their web visits, but I want to break down the spectrum of news sites so that it is possible to see the different types of news sites a person visits and to cut down to domain and specific topics.
Watch Menchen-Trevino discuss her research at American University.