HTML and SEO Basics

Semantic HTML

A major principle in web design is the use of HTML to indicate both what elements on a page are and how they appear.

Various HTML "tags" bracket your content in the source view of your web pages. The formatting and styling of each piece of content is managed by a system of opening and closing tags, i.e. <tag>content</tag>, along with a few "self-closing" tags that do not bracket any content, i.e. <tag />, such as <br /> for a line break and <hr /> for a horizontal rule.

Common opening and closing HTML tags that you may be familiar with include:

  • <strong> for bold text </strong>
  • <em> for italicized text </em>

"Semantic" markup refers to those HTML tags that introduce additional meaning to the content of the web page. For example, the opening and closing <h1></h1> tags are used to indicate the main heading of a page, and <h2></h2> tags are used to indicate "second-level headings" or subheadings. Successive levels of subheadings (h3, h4, h5, h6) are nested in order, as you would nest headings in an APA-styled thesis or any other formatted document that uses a Table of Contents feature.

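As a minimal sketch of this heading nesting, the fragment below outlines a hypothetical page. The page topic, headings and body text are invented for illustration only.

```html
<!-- Hypothetical page outline: each heading level nests under the one above it -->
<h1>Campus Services</h1>

<h2>Libraries</h2>
<p>Overview of the library system.</p>

<h3>Borrowing</h3>
<p>How to borrow and renew items.</p>

<h3>Study Spaces</h3>
<p>Where to find quiet and group study rooms.</p>

<h2>Dining</h2>
<p>Overview of dining options.</p>
```

Read top to bottom, the headings alone should work like a table of contents: no level is skipped, and each h3 belongs to the h2 above it.
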
  • Resource – See LinkedIn Learning (log in with your Queen's Net ID) for courses on HTML, such as:

 HTML Essential Training
Host: Jen Simmons
2h 45m – Beginner level

 Practical HTML for Marketing Projects
Host: Jen Kramer
1h 44m – Beginner level

Search Engine Optimization (SEO)

Semantic HTML forms the basis of on-page search engine optimization (SEO) so that Google and other search engines can decipher what your page is about. For example, while adding a bold font style or a large font size might indicate content of importance to the reader, it does not signal importance to a search engine. Instead, the search engine "understands" the relative importance of an h1 tag compared to an h4 tag, or an image tag compared to a blockquote tag.
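
The difference can be sketched with two versions of the same line. The heading text here is a made-up example, and the inline style is shown only to make the contrast visible.

```html
<!-- Looks like a heading to a reader, but is marked up as a paragraph, not a heading -->
<p style="font-size: 2em;"><strong>Admission Requirements</strong></p>

<!-- Marked up as a heading, so its place in the page structure is explicit -->
<h2>Admission Requirements</h2>
```

Both lines may look similar in a browser, but only the second tells a search engine that a new section starts here.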

For best SEO, all headings should include important keywords that best describe the text that follows.

Note: In WebPublish, the mandatory "Title" field at the top of the Basic Page template converts to an <h1> rendered at the top of the page, so the <h1> option has been removed from the WYSIWYG ("What You See Is What You Get") editor.

  • Resource – See LinkedIn Learning (log in with your Queen's Net ID) for courses on SEO, such as:

 SEO Foundations
Host: David Booth
3h 36m – General

Accessibility

Semantic HTML tags also enable users of keyboard shortcuts and assistive technology – such as screen readers, alternate keyboards and mouse tools – to navigate online more efficiently. For example, it enables the user to quickly tab between headings and allows a screen-reader tool to voice (or read aloud) only the linked text on a page.

Learn more: Accessibility

Semantic structure versus styles and classes

Generic containers such as <div> and <span>, and inline emphasis tags such as <strong> and <em>, tell us little about content structure but are widely used to style text on a page.

The look (font type, size, colour, alignment, line spacing, etc.) of the content on your web pages is informed by a "style sheet." Within a style sheet, a web designer can define how content within a particular tag will display.

For example, headings 1 through 6 – <h1> <h2> <h3> <h4> <h5> <h6> – are usually set to display at increasingly smaller sizes. While the semantic tag provides clues to other machines about the nature of the content, the style sheet determines how that content looks on the page, making it easier for the user to skim or read.

[Illustration of heading sizes]

When using semantic tags, make sure you are using them to convey meaning and not simply for their common display properties. For instance, use a table to display tabular data, not to position text on a page. Conversely, simply bolding a line of text does not elevate it to the status of a heading.

The basic styling embedded in your website via CSS (cascading style sheets) should minimize the need for any additional inline styling of your page content. Where you do want some additional control over how your content renders across the site, style sheets also enable designers to create custom "classes" for presentation. These classes can augment the presentation style for any HTML tag (e.g. to display an icon, to alter the text style or to render a background colour).

Also, with a very few changes in the style sheet, a designer can determine that all h1 headings will display a little larger, in red, with a serif font.
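
A minimal sketch of such a style sheet follows. The sizes, colours and the class name "highlight-box" are assumptions made up for this example and are not taken from any actual site theme.

```html
<style>
  /* Applies to every h1 on pages that load this style sheet */
  h1 {
    font-size: 2.5em;
    color: red;
    font-family: Georgia, serif;
  }

  /* Hypothetical custom class; the name "highlight-box" is invented here */
  .highlight-box {
    background-color: #eeeeee;
    padding: 1em;
  }
</style>

<h1>Page Title</h1>
<p class="highlight-box">This paragraph picks up the custom class styling.</p>
```

Changing the three declarations in the h1 rule changes every h1 that uses the style sheet, without touching the HTML of any page.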

A web designer/developer can provide further examples of the built-in styles and classes that you can apply to your site content.

Some other common semantic tags that are familiar to editors include:

  • <p></p> general body text / paragraph text
  • <blockquote></blockquote> quotation
  • <a></a> link
  • <img /> image (a self-closing tag)
  • <ul> and <li> - nested opening and closing tags, used together to create unordered lists
  • <ol> and <li> - nested opening and closing tags, used together to create ordered lists
  • <abbr></abbr> - abbreviation and acronym, used to explain short forms, such as:
    Come to <abbr title="Queen’s University International Centre">QUIC</abbr>
  • <sub></sub> subscript, <sup></sup> superscript
  • <hr /> line or section break (a self-closing tag)
  • <form></form> form
  • <table> table, <tr> table row, <td> table cell, <th> table header - nested opening and closing tags used to create tables, table rows and cells

  • Resource: W3 Schools
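
To show how the nested list and table tags above fit together, here is a short sketch. The list items are invented; the table reuses the course lengths mentioned earlier in this guide.

```html
<!-- Unordered list: <ul> wraps one <li> per item -->
<ul>
  <li>Apples</li>
  <li>Oranges</li>
</ul>

<!-- Ordered list: same structure, but the browser numbers the items -->
<ol>
  <li>Open the page editor.</li>
  <li>Save your changes.</li>
</ol>

<!-- Table: each <tr> is a row; <th> cells are headers, <td> cells hold data -->
<table>
  <tr>
    <th>Course</th>
    <th>Length</th>
  </tr>
  <tr>
    <td>HTML Essential Training</td>
    <td>2h 45m</td>
  </tr>
  <tr>
    <td>SEO Foundations</td>
    <td>3h 36m</td>
  </tr>
</table>
```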

Clean code

Having clean code is also very important. Often, editors will copy text from another source and paste it straight into their site, bringing a lot of garbage code (or, at best, code belonging to another platform or design) along with it. This can impact not only how web pages render, but their SEO as well.

For more information and tools to ensure clean code in your site, see WYSIWYG Toolbar and Source View.

Non-breaking spaces and line breaks: &nbsp; and <br />

In the source code, the use of &nbsp; creates what is called a "non-breaking space." When you want to ensure that two words stay together no matter how the text normally wraps in the browser window, you can use &nbsp; between those words in the HTML code.

Source code example:

<p>Keep&nbsp;these&nbsp;words&nbsp;together.</p>

You might use the non-breaking space in dates and measurements, such as where you don't want a line break between "January" and "1st" or between "3" and "centimeters" in any view.
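
For instance, the two cases just mentioned could be written as follows (the sentences themselves are invented for illustration):

```html
<!-- "January" and "1st" will always stay on the same line -->
<p>The workshop begins on January&nbsp;1st.</p>

<!-- The number and its unit will not be split across lines -->
<p>Leave a margin of 3&nbsp;centimeters on each side.</p>
```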

A non-breaking space code can also be used to create additional space between characters. For example, if you use your keyboard's space bar to create 6 spaces in your source code, the browser will render only one; inserting &nbsp; forces the browser to recognize the spacing you want. However, formatting beyond the use of a few non-breaking spaces in a row should be handled with padding and margin styles instead of non-breaking spaces.
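
The contrast might look like the sketch below. The inline style is used only to keep the example in one place, and the 2em value is an arbitrary assumption; in practice the spacing rule would normally live in the site's style sheet.

```html
<!-- Avoid: a run of non-breaking spaces used as layout -->
<p>Name:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Jane Doe</p>

<!-- Prefer: spacing handled by a margin style -->
<p>Name: <span style="margin-left: 2em;">Jane Doe</span></p>
```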

Additionally, unintended non-breaking spaces can create a strange mobile experience, where text wraps (or doesn’t wrap) at odd points. Copying and pasting from an MS Word document can add them to the code (see "Clean code" above). So, be sure to clear any unintended non-breaking spaces from your code. Also, be sure to look at your web pages on a small screen, such as a phone, to catch issues with how your page design is responding.

In contrast, the addition of the <br /> tag in your source code allows you to force a line break. In the WYSIWYG view, you can create these by inputting Shift + Return on your keyboard. Again, use these sparingly, as responsive design means that layouts should be (more or less) fluid.
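
A typical case where a forced line break is reasonable is a postal address. The address below is a made-up placeholder.

```html
<!-- Each <br /> ends the line; the whole address stays one paragraph -->
<p>
  Example Department<br />
  123 Example Street<br />
  Kingston, Ontario
</p>
```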

URLs and file naming

URLs are a minor factor in search engine optimization (SEO), but they are still worth paying attention to: keywords in a URL help when they reflect the topic of the page.

Human readable with keywords

  • Use "human-readable" URLs – i.e. a string of real words, not abbreviations or acronyms – that indicate the content of the page.

  • Use a hyphen to separate words. Do not use underscores, as search engines may not recognize them as word separators. Do not use other special characters, as they may cause the URL to fail.

  • Ensure that every page URL includes a distinct keyword that best describes the content of the page.

  • Use lower-case letters only. URL paths can be case-sensitive on some servers, so upper-case letters can lead to broken links.

Organized structure

Revealing your site architecture via your URL path or folder structure can orient users and enhance usability, both on the page and in search engine results. This means that each child page in a site should include the URL path of its parent page. A structured URL helps both users and search engines understand where they are on your site. It also reveals what you have determined is the most important content.

Avoid creating multiple pages that fall outside your site's content hierarchy. There should be a small number of level 1 pages.

Shorter is better

  • Short addresses with 2-3 unique and easy-to-understand keywords look attractive and are much easier to remember and enter manually in a browser.

  • While a browser can process a very long address (more than 2000 characters), longer URLs can have a negative impact on search engine performance. Aim for 50-75 characters as a maximum.

  • While WebPublish will automatically generate a URL for each page based on its page title, be aware that this can lead to superfluous and repeated words, and create URLs that are longer than necessary.

Putting it all together:

Human readable, keyword-rich, organized, short

Consider a set of three nested pages created at the top level of a website (i.e. the first is a parent of the second, and the second a parent of the third) where the subject matter is your dog:

  • The heading/title of the first page is Meet My Dog. Including the site domain (not the Queen's domain in this example), the automatically generated URL would be www.sitename.ca/meet-my-dog. This would be shown in the URL alias field in the editor area as "/meet-my-dog". Shorten the automated URL path for this page to "/my-dog".

  • If the heading of the second page is Tricks My Dog Can Do, the auto-generated URL path for this page would be /tricks-my-dog-can-do. But you already have "my-dog" in the parent page URL, so shorten this to /tricks.

  • For the third page, titled Awards My Dog Has Won for Tricks, the auto-generated URL path would be /awards-my-dog-has-won-tricks. (Note that the URL generator has dropped the preposition "for".) Again, you already have some of these keywords in the inherited URL path, so you could shorten this to /awards.

In the end, you have shortened the URL considerably while maintaining a human-readable URL path that contains strong keywords that describe the content of each page along the way.

Before: www.sitename.ca/meet-my-dog/tricks-my-dog-can-do/awards-my-dog-has-won-tricks (77 characters)
After: www.sitename.ca/my-dog/tricks/awards (36 characters)
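
As a rough sketch, the three nested pages could be linked in a site menu like this. The menu markup is an assumption for illustration; only the page titles and shortened paths come from the example above.

```html
<!-- Page titles stay descriptive; only the URL paths are shortened -->
<ul>
  <li><a href="/my-dog">Meet My Dog</a>
    <ul>
      <li><a href="/my-dog/tricks">Tricks My Dog Can Do</a>
        <ul>
          <li><a href="/my-dog/tricks/awards">Awards My Dog Has Won for Tricks</a></li>
        </ul>
      </li>
    </ul>
  </li>
</ul>
```

Each child link repeats the path of its parent, so the URL alone shows where the page sits in the hierarchy.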

Linked documents

File names for linked documents should follow many of the same principles as URLs in that they should be:

  • human readable
  • short
  • free of special characters
  • meaningful
  • free of blank spaces

Spaces in a file name translate to "%20" when they go online. Rename files using hyphens between readable words before uploading them to the server.

Consider that linked documents may end up on a user's desktop or in a downloads folder. Prepending a file name with the letters "QU", for example, might be helpful to indicate a Queen's document. Including date details or version numbers in the file name (as well as in the document itself, of course) can also be helpful. The Records Management and Privacy Office provides further guidance on document naming.
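
The sketch below contrasts two ways of naming the same linked file. The folder path "/files/", the document and its file names are all hypothetical.

```html
<!-- Avoid: spaces (encoded as %20) and special characters in the file name -->
<a href="/files/Travel%20Policy%20(FINAL)!.pdf">Travel policy (PDF)</a>

<!-- Prefer: short, hyphenated, with a "QU" prefix, a year and a version number -->
<a href="/files/QU-travel-policy-2024-v2.pdf">Travel policy, version 2 (PDF)</a>
```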

Learn more: Creating and Maintaining File Naming Standards

Documents and usability

Please note that linking to documents is not the best way of delivering content online, as the formatting of documents poses accessibility and usability issues.

Learn more: PDFs and Downloads

Image paths

Consider, as well, the naming of photo and illustration files. Again, file names should be human readable, short and free of special characters, and should include meaningful keywords.
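
As an illustration, compare a camera's default file name with a descriptive one. The "/images/" path, the file names and the alt text are assumptions for this example; the alt attribute is included only because an image tag should always carry one.

```html
<!-- Avoid: a default camera file name says nothing about the image -->
<img src="/images/IMG_4032.JPG" alt="Students studying in the library" />

<!-- Prefer: short, lower-case, hyphenated keywords -->
<img src="/images/students-studying-library.jpg" alt="Students studying in the library" />
```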

Learn more: Photo and Video Standards (login required)