Handbook of Information Security
Search Engines (security, privacy and ethical issues)
Raymond F. Wisman
ABSTRACT
Search engines open the front door of the Web, serving as the tool most often
used when people look for any type of information. For online companies, a high
ranking by a search engine can bring in customers, while a low ranking can render
the site invisible. Search engine attention is therefore commercially valuable
and often subject to attempts at manipulation. As a well-known and highly centralized
part of the Web infrastructure, search engines also present an attractive target for
direct attacks by those seeking to cause disruption.
The chapter gives an overview of search engine design and examines three areas:
security, including common attacks on search engines; privacy, including the
concerns of search engine users and concerns about information made public; and
ethics, including how sites attempt to manipulate search engines and how the
information gathered is used. The literature reviewed includes sources on:
· text searching [Salton, 1971]
· current search engine design [Schwartz, 1998]
· security vulnerabilities [Massimo, 1997], [Microsoft, 2000]
· site positioning techniques [Marckini, 2001]
· common browser vulnerabilities [CERT® Advisory CA-2003-22, 2003]
Additional research is needed on privacy and ethics on
the part of search engine users.
OUTLINE
1 INTRODUCTION
1-1 SECURITY
1-2 PRIVACY
1-3 ETHICAL ISSUES
2 WHAT IS SEARCHED
2-1 Types of Search
2-2 Information Sources
3 HOW SEARCH WORKS
3-1 Human Organized Lists
3-2 Search Engines
3-3 What Search Engines Search
3-4 What Search Engines Ignore
4 SEARCH ENGINE SECURITY
4-1 External Search
4-2 Local Search
5 BROWSER SECURITY
5-1 Common Vulnerabilities
6 PRIVACY
6-1 Personal Information
6-2 Exploitation
7 ETHICS
7-1 Search Engine Positioning
7-2 Paid Placement
7-3 Personal Information
7-4 Intellectual Property
8 CONCLUSION
References
Arasu, A., Cho, J., Garcia-Molina, H., Paepcke, A., and Raghavan, S.
Searching the Web. ACM Transactions on Internet Technology, Vol. 1, No. 1,
August 2001, Pages 2-43.
Byers, S., Rubin, A., and Kormann, D. Defending Against an Internet-based Attack on
the Physical World. WPES’02, November 21, 2002, Washington, DC, USA. ACM,
Pages 11-18.
Chakrabarti, S. Recent results in automatic Web resource discovery.
ACM Computing Surveys, Vol. 31, Number 4es, December 1999.
CERT® Advisory CA-2003-22. Multiple Vulnerabilities in Microsoft Internet Explorer.
CERT/CC. http://www.cert.org/advisories/CA-2003-22.html
CERT/CC Vulnerability Note VU#612843. ‘Sun iPlanet and ONE Web Servers
contain a buffer overflow in the search engine’. http://www.kb.cert.org/vuls/id/612843
CERT/CC Vulnerability Note VU#146704. ‘iWeb Systems Hyperseek
search engine may allow malformed URL requests to access files outside the
document root of a vulnerable system’. http://www.kb.cert.org/vuls/id/146704
CERT/CC. Understanding Malicious Content Mitigation For Web Developers.
http://www.cert.org/tech_tips/malicious_code_mitigation.html
Gordon, M., and Pathak, P. Finding information on the World Wide Web: the retrieval
effectiveness of search engines. Information Processing and Management,
35(2):141-180, 1999.
Julio César Hernández, José María Sierra, Arturo Ribagorda, Benjamín Ramos.
Search Engines as a Security Threat. IEEE Computer, Vol. 34, No. 10,
October 2001, pp. 25-30.
Lawrence, S., and Giles, L. Accessibility of Information on the
Web. Nature, 400:107-109, 1999.
Marckini, F. Search Engine Positioning. Wordware Publishing, 2001. ISBN 1-55622-804-X.
Massimo Marchiori. Security of World Wide Web search engines.
IFIP TC5 WG5.4 3rd International Conference on Reliability, Quality and Safety of
Software-Intensive Systems, 1997.
Microsoft Security Bulletin (MS00-078). Patch Available for 'Web Server Folder
Traversal' Vulnerability. 2000.
Peltonen, K. Adding Full Text Indexing to the Operating System.
Proceedings of the Thirteenth International Conference on Data Engineering,
1997, Pages 386-390.
Robots Exclusion Protocol. 2000.
http://info.webcrawler.com/mak/projects/robots/exclusion.html
Salton, G., and Buckley, C. Term-Weighting Approaches in
Automatic Text Retrieval. Information Processing and Management,
24(5):513-523, 1988.
Salton, G., Wong, A., and Yang, C. S. A Vector Space Model for Automatic Text Retrieval.
Communications of the ACM, 18(11):613-620, 1971.
SANS Top 20 Internet Security Vulnerabilities. http://www.sans.org/top20
Schwartz, C. Web Search Engines. Journal of the American Society for
Information Science, 49(11):973-982, 1998.
Search Engine Watch. http://www.searchenginewatch.com.