Handbook of Information Security
Search Engines (security, privacy and ethical issues)
Raymond F. Wisman
ABSTRACT - Search engines are the front door of the Web and the tool people use most often when looking for information of any type. For online companies, a high ranking by a search engine can bring in customers, while a low ranking can render the site invisible. Search engine attention is therefore commercially valuable and is often subject to attempts at manipulation. As a well-known and highly centralized part of the Web infrastructure, search engines also present an attractive target for direct attacks by those seeking to cause disruption.
The chapter gives an overview of search engine design and examines three areas: security, including common attacks on search engines; privacy, covering the concerns of search engine users and concerns about information made public; and ethics, covering how sites attempt to manipulate search engines and how the information is used. The literature reviewed includes sources on:
· text searching [Salton, 1971]
· current search engine design [Schwartz, 1998]
· security vulnerabilities [Massimo, 1997], [Microsoft, 2000]
· site positioning techniques [Marckini, 2001]
· common browser vulnerabilities [CERT® Advisory CA-2003-22, 2003]
Additional research is needed on the privacy and ethics of search engine use.
1-3 ETHICAL ISSUES
2 WHAT IS SEARCHED
2-1 Types of Search
2-2 Information Sources
3 HOW SEARCH WORKS
3-1 Human Organized Lists
3-2 Search Engines
3-3 What Search Engines Search
3-4 What Search Engines Ignore
4 SEARCH ENGINE SECURITY
4-1 External Search
4-2 Local Search
5 BROWSER SECURITY
5-1 Common Vulnerabilities
6-1 Personal Information
7-1 Search Engine Positioning
7-2 Paid Placement
7-3 Personal Information
7-4 Intellectual Property
Arasu, A., Cho, J., Garcia-Molina, H., Paepcke, A., and Raghavan, S. Searching the Web. ACM Transactions on Internet Technology, Vol.1, No. 1, August 2001, Pages 2-43.
Byers, S., Rubin, A., and Kormann, D. Defending Against an Internet-based Attack on the Physical World. WPES’02, November 21, 2002, Washington, DC, USA, Pages 11-18.
Chakrabarti, S. Recent results in automatic Web resource discovery. ACM Computing Surveys, Vol. 31, Number 4es, December 1999.
CERT® Advisory CA-2003-22. Multiple Vulnerabilities in Microsoft Internet Explorer. 2003. Source: CERT/CC, http://www.cert.org/advisories/CA-2003-22.html
CERT/CC Vulnerability Note VU#146704 ‘iWeb Systems Hyperseek search engine may allow malformed URL requests to access files outside the document root of a vulnerable system’ http://www.kb.cert.org/vuls/id/146704
Gordon, M., and Pathak, P. Finding information on the world wide web: the retrieval effectiveness of search engines. Information Processing and Management, 35(2):141-180, 1999.
Julio César Hernández, José María Sierra, Arturo Ribagorda, Benjamín Ramos. Search Engines as a Security Threat. IEEE Computer. October 2001 (Vol. 34, No. 10), pp. 25-30
Lawrence, S., and Giles, L. Accessibility of Information on the Web. Nature, 400:107-109, 1999.
Marckini, F. Search Engine Positioning. Wordware Publishing, 2001, ISBN 1-55622-804-X.
Massimo Marchiori. Security of World Wide Web search engines. IFIP TC5 WG5.4 3rd international conference on Reliability, quality and safety of
Microsoft Security Bulletin (MS00-078). Patch Available for 'Web Server Folder Traversal' Vulnerability. 2000.
Peltonen, K. Adding Full Text Indexing to the Operating System. Proceedings of the Thirteenth International Conference on Data Engineering, 1997, Pages 386-390.
Robots Exclusion Protocol. 2000. http://info.webcrawler.com/mak/projects/robots/exclusion.html
Salton, G., and Buckley, C. Term-Weighting Approaches in Automatic Text Retrieval. Information Processing and Management, 24(5):513-523, 1988.
Salton, G., Wong, A., and Yang, C. S. A Vector Space Model for Automatic Text Retrieval. Communications of the ACM, 18(11):613-620, 1971.
SANS Top 20 Internet Security Vulnerabilities. http://www.sans.org/top20
Schwartz, C. Web Search Engines. Journal of the American Society for Information Science, 49(11):973-982, 1998.
Search Engine Watch. http://www.searchenginewatch.com.