Computer Science Wiki - Web_Science | computersciencewiki.org Web_Science page | Excellent browsing site for web science topics |
WikiBooks web science | wikibook_IBCS_Web_Science | Information about CORE (C1-4) web science topics |
WAMP tutorial | WAMP tutorial | Windows, Apache, MySQL, PHP are often used in combination by web developers. |
The CS classroom web science video tutorial (2hrs20 | CS_Classroon_Web_Science_YouTube | In-depth information about all web science topics |
What is a search engine | https://www.techtarget.com/whatis/definition/search-engine | search engine page on Techtarget definitions database. |
How does a Google search work | https://support.google.com/webmasters/answer/9128586?hl=en | The basics from the Google support page |
Web Application Architecture | https://stackify.com/web-application-architecture/ | Stackify guide to: How It Works, Trends, Best Practices and More |
Tutorchase | https://www.tutorchase.com/notes/ib/computer-science | Contains a good Web Science section - some direct links in the topic sections below... |
Students will be expected to have completed practical activities linked to developing different types of web
pages and be able to evaluate when a particular type of web page is most appropriate.
Click for C.1 Links page containing the information you need
The Internet is the interconnected network of networks and devices that uses IP addresses to identify them and low-level network protocols (IP) to transfer data.
The World Wide Web is the network of resources held on the computers of the Internet. It is also the way of transferring those resources over the Internet using high-level protocols such as HTTP.
Students will be expected to be aware of the major differences between the early forms of the web, Web 2.0, the semantic web and later developments.
Develop an appreciation of the possibilities and limitations associated with the evolution of the web.
Introduction to the semantic web [Cambridge semantics]
Item | Characteristic |
---|---|
hypertext transfer protocol (HTTP) | The underlying protocol used by the World Wide Web which defines how messages are formatted and transmitted using Client actions and Server responses. HTTP is a stateless application layer protocol. HTTP uses Request Methods (POST, GET, [PUT, DELETE]) for data handling. Text based protocol (not secure as unencrypted). |
hypertext transfer protocol secure (HTTPS) | HTTP that uses the Transport Layer Security (TLS) protocol to provide authentication and encryption. Older implementations used Secure Sockets Layer (SSL). |
hypertext mark-up language (HTML) | HTML is the standard markup language for creating web pages. See: HP html guide |
uniform resource locator (URL) | The global address of documents and other resources on the World Wide Web consisting of protocol, domain name (or IP address, file |
extensible mark-up language (XML) | eXtensible Markup Language was designed to store and transport data and to be both human- and machine-readable. |
extensible stylesheet language transformations (XSLT) | Transformations which typically use XSL to transform XML documents into other formats (like HTML) |
JavaScript (JS) | JS is the most widely used client-side scripting language. It is a programming language that can be interpreted by the browser. |
cascading style sheet (CSS) | CSS describes how HTML elements are to be displayed. External stylesheets are stored in CSS files and can control the layout of multiple web pages all at once |
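The HTTP row above describes a text-based protocol built from client requests and server responses. As a rough illustration only, the sketch below builds the text of a minimal GET request and reads the status line of a response. The host `example.org`, the path and the sample response are made-up values, and the helper names are hypothetical.

```python
def build_get_request(host, path="/"):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # the Host header is required in HTTP/1.1
        "Connection: close",
        "",                      # a blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)


def parse_status_line(response):
    """Split the first line of a response into (version, code, reason)."""
    status_line = response.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


# Hypothetical request and a hand-written sample response (no network involved).
request = build_get_request("example.org", "/index.html")
sample_response = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"

print(request)
print(parse_status_line(sample_response))
```

Because each request carries everything the server needs (method, path, headers), the server does not have to remember earlier requests, which is what "stateless" means here.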
URI: uniform resource identifier
URL: uniform resource locator
URLs and URNs are special forms of URIs.
A URI that identifies a mechanism by which a resource may be accessed is usually referred to as a URL.
HTTP URIs are examples of URLs.
Some URIs are not URLs. For example, a URN provides a globally unique name for a resource. If the URI has urn as its scheme and adheres to the requirements of RFC 2141 and RFC 2611, it is a URN. The ISBN of the book REST in Practice by J. Webber, S. Parastatidis and I. Robinson, from which this paragraph is partly taken, is ISBN-13: 978-0596805821. It identifies the book uniformly (same format as for other books), but the book could be located in more than one place, so it is not a URL.
ref: [https://stackoverflow.com/questions/42534419/examples-of-uri-url-and-urn]
A URL is a way of uniquely locating a resource, most commonly using HTTP, for example:
protocol://domain (or IP address)/path[?query#fragment]
http://hockerillct.com/16/CT/ib/web_science.html
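The parts of the URL pattern above can be pulled apart with Python's standard `urllib.parse` module. This is only a small sketch: the first URL is the one given above, while the second (with a query and fragment) is a made-up address on `example.org`.

```python
from urllib.parse import urlsplit

# The example URL from the notes above.
parts = urlsplit("http://hockerillct.com/16/CT/ib/web_science.html")
print(parts.scheme)  # the protocol
print(parts.netloc)  # the domain (or IP address)
print(parts.path)    # the path to the resource

# A hypothetical URL that also has the optional query and fragment parts.
example = urlsplit("http://example.org/search?q=web+science#results")
print(example.query)
print(example.fragment)
```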
The Domain Name System (DNS) comprises many servers which maintain and distribute domain names and their corresponding IP addresses. An HTTP request will typically contain a domain name, but what is needed for communication to take place is the IP address. When a new HTTP request is made using a domain name, the DNS finds a server which holds the IP address of the domain and sends this to the requesting computer, so that it can make the page request using the IP address of the machine where the domain is hosted.
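The lookup described above can be modelled with a toy resolver. This is a simplified sketch, not how real DNS servers are implemented: the record table, the class name `ToyResolver` and the addresses (taken from ranges reserved for documentation) are all assumptions made for illustration. In a real program the standard library call `socket.gethostbyname` asks the operating system to perform the actual lookup.

```python
# Hypothetical domain-to-IP records standing in for the data held by DNS servers.
DNS_RECORDS = {
    "example.org": "192.0.2.10",
    "example.net": "198.51.100.7",
}


class ToyResolver:
    """Looks up a domain name and remembers (caches) the answer."""

    def __init__(self, records):
        self.records = records
        self.cache = {}
        self.lookups = 0  # how many times the record table had to be consulted

    def resolve(self, domain):
        if domain in self.cache:   # already known: no new lookup needed
            return self.cache[domain]
        self.lookups += 1
        ip_address = self.records.get(domain)
        if ip_address is None:
            raise LookupError(f"no record for {domain}")
        self.cache[domain] = ip_address
        return ip_address


resolver = ToyResolver(DNS_RECORDS)
first = resolver.resolve("example.org")   # consults the records
second = resolver.resolve("example.org")  # answered from the cache
print(first, second, resolver.lookups)
```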
Item | Characteristic |
---|---|
internet protocol (IP) | takes the network packets from the transport layer and sends them to the proper destinations based on their IP addresses |
transmission control protocol (TCP) | creates and delivers the data packets passed on from the application layer to the appropriate host devices by adding source and destination port numbers and maintaining the end-to-end network connections. |
file transfer protocol (FTP) | FTP is an application layer protocol in the TCP/IP model. When an FTP client requests to connect to an FTP server, a TCP connection is established using ports 20 and 21. FTP relies on TCP to ensure all the packets of data are sent correctly and to the proper destination. |
Information taken from: CERBERUS: How Does TCP/IP Relate to FTP? and cellbiol: The TCP/IP family of Internet protocols
To include features such as metatags, title, etc.
Simple HTML page ==> [w3schools how to write a website]
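To make features such as the title and metatags concrete, the sketch below holds a very small HTML page in a string and uses Python's standard `html.parser` module to pull out its title and named meta tags. The page content and the class name `HeadInfo` are invented for this example.

```python
from html.parser import HTMLParser

# A made-up minimal page with a title and two metatags in its head.
PAGE = """<!DOCTYPE html>
<html>
<head>
  <title>My first page</title>
  <meta charset="utf-8">
  <meta name="description" content="A simple example page">
  <meta name="keywords" content="web science, html">
</head>
<body><h1>Hello</h1><p>Some text.</p></body>
</html>"""


class HeadInfo(HTMLParser):
    """Collects the page title and any <meta name=... content=...> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and "name" in attributes:
            self.meta[attributes["name"]] = attributes.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


info = HeadInfo()
info.feed(PAGE)
print(info.title)
print(info.meta)
```

A browser or search engine reads the same head section: the title is shown in the tab and in search results, while metatags describe the page to software rather than to the reader.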
Protocols enable compatibility through a common "language" internationally.
This should include examples such as personal pages, blogs, search engine pages, wikis, forums.
To include analysis of static HTML web pages and dynamic web pages, e.g. PHP, ASP.NET, Java Servlets, Ajax.
Best languages for server side programming (wpwebinfotech.com)
A browser is an application (software) that allows users to access and view web pages. It makes HTTP requests and renders the resulting HTML code, including links and references to multimedia, styling, scripting etc.
Serverside first steps (mozilla.org)
Students will not be expected to write code (MySQL for example) to indicate how the connection is made, but should understand the principles of connecting to an underlying data source.
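Although writing such code is not required, seeing the principle once may help. The sketch below uses Python's built-in `sqlite3` module with an in-memory database instead of MySQL (an assumption made so the example runs anywhere). The table `pages` and the function `render_page_list` are hypothetical names. The pattern is: open a connection, run a query, build the HTML from the rows, close the connection.

```python
import sqlite3

# 1. Connect to the data source (an in-memory SQLite database for this sketch).
conn = sqlite3.connect(":memory:")

# 2. Set up some sample data (on a real site this already exists).
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO pages (title) VALUES (?)", [("Home",), ("About",)])


def render_page_list(connection):
    """Query the database and build an HTML list from the result rows."""
    rows = connection.execute("SELECT title FROM pages ORDER BY id").fetchall()
    items = "".join(f"<li>{title}</li>" for (title,) in rows)
    return f"<ul>{items}</ul>"


# 3. The server-side script generates the page content from the data.
html = render_page_list(conn)
print(html)

# 4. Close the connection when finished.
conn.close()
```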
C2 Information page and links C2 Questions and tasks
TOK Data is always accessible.
Students will be expected to understand only the principles of the PageRank and HITS algorithms.
General principles of computational thinking, connecting computational thinking and program design.
PageRank (searchenginejournal.com)
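For the HITS half of the requirement above, the principle is that a good hub links to good authorities and a good authority is linked to by good hubs. The sketch below is one common way to iterate that idea on a tiny made-up link graph. The page names, the fixed number of iterations and the normalisation choice are assumptions, and no such calculation is expected in the exam.

```python
import math

# Hypothetical link graph: each page maps to the pages it links to.
links = {"A": ["C", "D"], "B": ["C", "D"], "C": ["D"], "D": []}


def normalise(scores):
    """Scale scores so that the sum of their squares is 1."""
    norm = math.sqrt(sum(value * value for value in scores.values())) or 1.0
    return {page: value / norm for page, value in scores.items()}


def hits(graph, iterations=20):
    pages = list(graph)
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages that link to this page.
        auth = normalise(
            {p: sum(hub[q] for q in pages if p in graph[q]) for p in pages}
        )
        # Hub score: sum of the authority scores of the pages this page links to.
        hub = normalise({p: sum(auth[q] for q in graph[p]) for p in pages})
    return hub, auth


hub, auth = hits(links)
print("best authority:", max(auth, key=auth.get))
print("best hub:", max(hub, key=hub.get))
```

In this graph D receives the most links, so it comes out as the strongest authority, while A and B (which link to both C and D) are the strongest hubs.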
Teachers should be aware of the range of terms that can be associated with web crawlers such as bots, web spiders, web robots.
Evolution of metatags and why they are not so important now (except the title?)
Students should be aware that this is not always a transitive relationship. If website A is related to website B and website B is related to website C, it does not follow that website A is related to website C.
TOK Data may not always have the intended meaning.
Web crawler parallelization policy (Wikipedia)
Distributed Web crawling (Wikipedia)
Students will be expected to test specific data in a range of search engines, for example examining time taken, number of hits, quality of returns.
An understanding of search engine metrics could lead to exploitation.
Students will be expected to understand that the ability of the search engine to produce the required results is based primarily on the assumptions used when developing the algorithms that underpin it.
LINK Connecting computational thinking and program design.
AIM 8 Developers of search engines should have a moral responsibility to produce an objective page ranking.
Issues such as error management and lack of quality assurance of information uploaded.
AIM 9 Develop an appreciation that search engines will need to evolve to remain effective as the web grows.
• mobile computing
• ubiquitous computing
• peer-to-peer network
• grid computing.
LINK Networks.
Students should be aware of developments in mobile technology that have facilitated the growth of distributed networks.
INT Decentralization has increased international-mindedness.
Students will not be required to study the detailed compression algorithms.
Students can test different compression methods to evaluate their effectiveness.
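One possible way to run such a test is with the lossless compression modules in Python's standard library. The sketch below compresses the same sample data with `zlib`, `bz2` and `lzma` and reports the compressed size as a fraction of the original. The sample text is invented and very repetitive, so real files will give different ratios.

```python
import bz2
import lzma
import zlib

# Made-up, highly repetitive sample data (repetition compresses well).
sample = ("<p>Web science notes</p>\n" * 200).encode("utf-8")

methods = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}

results = {}
for name, (compress, decompress) in methods.items():
    packed = compress(sample)
    # Lossless compression must give back exactly the original data.
    if decompress(packed) != sample:
        raise ValueError(f"{name} did not restore the original data")
    results[name] = len(packed) / len(sample)
    print(f"{name}: {len(sample)} -> {len(packed)} bytes (ratio {results[name]:.3f})")
```

To evaluate effectiveness more fairly, the same loop can be run on different kinds of data (plain text, already compressed images, random bytes) and the ratios and timings compared.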
Students should be aware of issues linked to the growth of new internet technologies such as Web 2.0 and how they have shaped interactions between different stakeholders of the web.
E, AIM 8 Emerging technologies are modifying users' behaviour.
C4 tutorchase notes (Part 1: Online Interaction and Social Networking)
Students should address the major differences only.
LINK Networks.
C4 tutorchase notes (Part 2: Cloud vs Traditional Client-Server)
To include public and private clouds.
AIM 8 Cloud computing could potentially conflict with privacy.
Students should investigate sites such as TurnItIn and Creative Commons. https://www.tutorchase.com/notes/ib/computer-science/c-4-3-intellectual-property-and-privacy
C4 tutorchase article (primary concerns of intellectual property)
C4 tutorchase notes (Part 3: Intellectual property and privacy)
C4 tutorchase notes (Part 4: Future web development)
AIM 9 Develop an appreciation that the future development of the web will have an effect on the rules and structures that support it.
INT, S/E, AIM 8 The web is creating new multinational online oligarchies.
C4 tutorchase article (impact of decentralization on web performance)
S/E, INT The web has changed users' behaviours and "removed" international boundaries.
Introduction document | Introduction to web science (a bit outdated now but worth a read) | PDF presentation of key ideas for C5 |
The vertices (nodes) represent web pages and the edges represent hyperlinks.
It is not a complete graph. The directed graph formed by the web is known as the web graph.
http://en.wikipedia.org/wiki/Directed_graph
http://mathinsight.org/network_introduction
A sub-graph will be assumed to be a set of pages linked to one specific topic.
Students must be aware the web has a structure that has emerged from the behaviour of web users.
LINK Mathematics: graph theory.
The eccentricity of a vertex is the greatest shortest-path distance between itself and any other vertex. It can be thought of as how far a node is from the node most distant from it in the graph.
The diameter of a graph is the maximum eccentricity of any vertex in the graph. That is the greatest distance between any pair of vertices i.e. find the shortest path between each pair of vertices then the greatest length of any of these paths is the diameter of the graph.
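The two definitions above can be checked on a small example. The sketch below uses breadth-first search to find shortest distances in a tiny made-up directed web graph, then takes the maximum to get each eccentricity and the diameter. It assumes every page can be reached from every other page; the page names and function names are hypothetical.

```python
from collections import deque

# Hypothetical web graph: each page maps to the pages it links to.
web_graph = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": ["A"]}


def distances_from(graph, start):
    """Breadth-first search: shortest number of links from start to each page."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph[page]:
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


def eccentricity(graph, vertex):
    """Greatest shortest-path distance from vertex to any other vertex."""
    return max(distances_from(graph, vertex).values())


def diameter(graph):
    """Maximum eccentricity over all vertices."""
    return max(eccentricity(graph, vertex) for vertex in graph)


for page in web_graph:
    print(page, eccentricity(web_graph, page))
print("diameter:", diameter(web_graph))
```

Here the shortest route from A to D is A, B, C, D (three links), and no pair of pages is further apart than that, so the diameter is 3.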
Students should be aware of the PageRank algorithm and explain how it works. No calculations are required.
Cambridge power law and web graph lecture notes
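Although no calculations are required, a short simulation can help when explaining how PageRank works: each page repeatedly shares its rank among the pages it links to, plus a small amount that every page receives regardless of links. The sketch below is a simplified version under stated assumptions: a damping factor of 0.85, a fixed 50 iterations and a made-up four-page graph. It is not a description of how any search engine actually implements it.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Iteratively share each page's rank among the pages it links to."""
    pages = list(graph)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal rank
    for _ in range(iterations):
        # Every page gets a small base amount regardless of links.
        new_rank = {page: (1 - damping) / n for page in pages}
        for page, targets in graph.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical web graph: each page maps to the pages it links to.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

Page C is linked to by three pages and ends up with the highest rank, while D, which nothing links to, keeps only the base amount.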
Introduction document | Tutorchase: Introduction to semantic web foundations | HTML presentation of key ideas for C6 |
The semantic web is an abstract web concept – currently in development.
The aim is to have web pages organised and labelled in a better way to aid organisation and searching.
http://www.w3.org/standards/semanticweb/
"The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. It is a collaborative effort led by W3C with participation from a large number of researchers and industrial partners." [w3C]
Meta data added to web pages could make web pages easier to read and organise by computers.
To paraphrase Tim Berners-Lee, inventor of the World Wide Web, these tools will let the Web -- currently similar to a giant book -- become a giant database.
XML and RDF are the "official language" of the Semantic Web, but by themselves they're not enough to make the entire Web accessible to a computer.
http://computer.howstuffworks.com/semantic-web.htm
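The "giant database" idea can be sketched with RDF-style triples, where each fact is stored as a subject, predicate and object. The example below is only an illustration in plain Python: the `ex:` names are hypothetical identifiers, the facts are taken from statements in these notes, and the `match` helper is an invented stand-in for a real query language.

```python
# Hypothetical triples: (subject, predicate, object).
triples = [
    ("ex:TimBernersLee", "ex:invented", "ex:WorldWideWeb"),
    ("ex:W3C", "ex:leads", "ex:SemanticWeb"),
    ("ex:SemanticWeb", "ex:uses", "ex:RDF"),
    ("ex:SemanticWeb", "ex:uses", "ex:XML"),
]


def match(pattern, store=triples):
    """Return the triples that fit the pattern; None matches anything."""
    return [
        triple
        for triple in store
        if all(want is None or want == got for want, got in zip(pattern, triple))
    ]


# "What does the Semantic Web use?"
print(match(("ex:SemanticWeb", "ex:uses", None)))
# "Who invented something?"
print(match((None, "ex:invented", None)))
```

Because every fact has the same three-part shape, a program can answer such questions without understanding the surrounding page text.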
The traditional web is seen as being text based, the semantic web is multimedia based.
AIM 9 Develop an appreciation of the possibilities and limitations associated with the continuing evolution of the web.
Google Bard answer to aims of semantic web
The core meaning of ontology within computer science is a model for describing the world that consists of a set of types, properties, and relationship types. An ontology is a vocabulary that describes objects and how they relate to one another.
The Dublin Core was the first metadata standard for describing web content. The resources described using the Dublin Core may be digital resources (video, images, web pages, etc.) as well as physical resources such as books or works of art. Dublin Core metadata may be used for multiple purposes, from simple resource description to combining metadata vocabularies of different metadata standards, to providing interoperability for metadata vocabularies in the linked data cloud and Semantic Web implementations. Wikipedia - Dublin core
Web Ontology Language (OWL) - OWL, the most complex layer, formalizes ontologies, describes relationships between classes and uses logic to make deductions. It can also construct new classes based on existing information. OWL is available in three levels of complexity: Lite, Description Logic (DL) and Full.
http://en.wikipedia.org/wiki/Ontology_(information_science)
Folksonomy – social tagging
The tagging is done by users, often simultaneously.
The practice of generating electronic tags or keywords by users rather than specialists as a way to classify and describe online content.
There are two types: broad and narrow. A broad folksonomy is one in which multiple users tag particular content with a variety of terms from a variety of vocabularies, thus creating a greater amount of metadata for that content. A narrow folksonomy, on the other hand, occurs when a few users, primarily the content creator, tag an object with a limited number of terms.
http://en.wikipedia.org/wiki/Folksonomy
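A broad folksonomy can be pictured as many users tagging the same item, with the most frequent tags emerging as its description. The sketch below counts tags with Python's `collections.Counter`. The user names and tags are invented for illustration.

```python
from collections import Counter

# Hypothetical tags applied by different users to the same piece of content.
broad_tags = {
    "alice": ["python", "tutorial"],
    "bob": ["python", "programming"],
    "carol": ["tutorial", "python"],
}

# Combine every user's tags and count how often each tag was used.
tag_counts = Counter(tag for tags in broad_tags.values() for tag in tags)

print(tag_counts.most_common())
```

In a narrow folksonomy the same count would be taken over the few tags supplied by the content creator, so there is far less metadata to work with.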
AIM 8 Emerging technologies are modifying users' behaviour.
Teachers must address issues relating to searching for non-text based files/multimedia files such as using feature analysis.
Students will be expected to have researched examples such as biometrics, nanotechnologies.
AIM 9 Develop an appreciation of the possibilities that ambient intelligence provides in supporting people when carrying out routine tasks.
Ambient Intelligence describes an environment which is sensitive and responsive to human presence.
http://en.wikipedia.org/wiki/Ambient_intelligence
Identification of human characteristics by computer systems.
(Fingerprint, eye details, voice recognition, facial recognition)
http://www.webopedia.com/TERM/B/biometrics.html
One nanometer is a billionth of a meter (1 × 10⁻⁹ m). This is the scale of nanotechnology.
Examples include
The manufacture of computer chips which has the potential to launch a new generation of electronic devices that run faster, while using less energy, than those made from silicon chips
Nanotechnology engineers build first carbon nanotube computer [nanowerk]
More recent: physicists-build-nanomaterial-microchip-using-graphene [nanomagazine]
Also: http://en.wikipedia.org/wiki/Industrial_applications_of_nanotechnology#Consumer_goods
MIT Center for Collective Intelligence
Students will be expected to have researched examples such as climate change, social bookmarking and stock market fluctuations.
The above are quite old links, so do your own research as well.
AIM 5 Engender an awareness that effective collaboration and communication can resolve complex problems.
S/E, AIM 8 Emerging technologies are modifying users' behaviour.
TOK It is possible to have a collective intelligence greater than the sum of the contributors