One of our customers has a solution that lets them quickly create landing pages, which are then used for SEM. Naturally, these pages are listed in the sitemap of the application domain. The latest addition to that combo was listing the videos embedded on the landing pages in the sitemap as well. It sounded hard, but turned out to be quite easy.

Our landing pages contain HTML saved from a content editor. It is just HTML. The videos are embedded in the usual way recommended by the providers, such as:

<iframe width="560" height="315" src="https://www.youtube.com/embed/BBnN5VLuxKw" frameborder="0" allowfullscreen></iframe>
<iframe src="//player.vimeo.com/video/2854412" width="500" height="311" frameborder="0" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe>

Get the list of embedded videos from the HTML of the landing page

Nokogiri to the rescue :)

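require "nokogiri"

# Returns metadata for every recognizable video embedded in the given HTML.
# Iframes without a src attribute are skipped.
def extracted(html)
  Nokogiri::HTML(html).xpath("//iframe[@src]").map do |iframe|
    extract_metadata(iframe[:src])
  end.compact
end

extracted(LandingPage.first.content)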

Get the metadata of videos based on their URL

VideoInfo to the rescue :)

VideoMetadata = Struct.new(
  :title,
  :description,
  :thumbnail_location,
  :player_location,
  :duration_in_seconds,
  :publication_date,
)

def extract_metadata(url)
  player_location = url
  player_location = "http:#{url}" if URI(url).scheme.nil?

  vi = VideoInfo.new(url)
  VideoMetadata.new(
    vi.title,
    vi.description,
    vi.thumbnail_large,
    player_location,
    vi.duration,
    vi.date
  )
rescue VideoInfo::UrlError, *NetHttpTimeoutErrors.all
  nil
end

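If an iframe does not point to a recognizable video, VideoInfo raises an exception, which we catch. If there is a networking problem, we handle it gracefully as well. In both cases extract_metadata returns nil, and the compact call in the first snippet drops it from the list.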
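
The two defensive steps in extract_metadata can be tried in isolation. The sketch below is only an illustration, not the production code: with_scheme, safe_metadata and FakeLookupError are hypothetical names, and the lambda stands in for the real VideoInfo call so that nothing touches the network.

```ruby
require "uri"

# Hypothetical error standing in for VideoInfo::UrlError.
class FakeLookupError < StandardError; end

# Prefix protocol-relative URLs ("//host/path") with a scheme,
# mirroring the player_location logic above.
def with_scheme(url, scheme = "http")
  URI(url).scheme.nil? ? "#{scheme}:#{url}" : url
end

# Run a lookup and turn a known failure into nil,
# so the caller can drop it with compact.
def safe_metadata(url, lookup)
  lookup.call(url)
rescue FakeLookupError
  nil
end

# Stand-in lookup (assumption): only YouTube embeds are "recognized" here.
lookup = lambda do |url|
  raise FakeLookupError, url unless url.include?("youtube.com/embed/")
  { player_location: with_scheme(url) }
end

urls = [
  "https://www.youtube.com/embed/BBnN5VLuxKw",
  "//www.youtube.com/embed/BBnN5VLuxKw",
  "https://example.com/some-widget"
]

# The unrecognized URL becomes nil and is removed by compact.
results = urls.map { |url| safe_metadata(url, lookup) }.compact
```

Returning nil and compacting keeps the calling code free of error handling: one bad iframe never prevents the remaining videos on the page from being listed.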

Use metadata in the sitemap

SitemapGenerator to the rescue.

SitemapGenerator::Sitemap.create do
  LandingPage.find_each do |landing_page|
    videos = extracted(landing_page.content).map do |video_metadata|
      {
        title:            video_metadata.title,
        description:      video_metadata.description,
        thumbnail_loc:    video_metadata.thumbnail_location,
        player_loc:       video_metadata.player_location,
        duration:         video_metadata.duration_in_seconds,
        publication_date: video_metadata.publication_date
      }
    end

    add(
      landing_page_path(id: landing_page.slug),
      lastmod: landing_page.updated_at,
      changefreq: 'monthly',
      priority: 0.7,
      videos: videos
    )
  end
end
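
The hash built inside the block is a one-to-one renaming of the struct's fields to the option names used in the snippet above. Here is a minimal sketch of pulling that mapping into a helper. SITEMAP_VIDEO_KEYS and to_sitemap_video are hypothetical names, the sample values are made up, and the struct is redefined only so that the example runs on its own.

```ruby
# Same fields as the VideoMetadata struct defined earlier.
VideoMetadata = Struct.new(
  :title, :description, :thumbnail_location,
  :player_location, :duration_in_seconds, :publication_date
)

# Struct field => key used in the sitemap hash above.
SITEMAP_VIDEO_KEYS = {
  title:               :title,
  description:         :description,
  thumbnail_location:  :thumbnail_loc,
  player_location:     :player_loc,
  duration_in_seconds: :duration,
  publication_date:    :publication_date
}.freeze

# Build the hash passed in the videos: option for a single video.
def to_sitemap_video(video_metadata)
  SITEMAP_VIDEO_KEYS.to_h { |field, key| [key, video_metadata[field]] }
end

# Made-up sample data, for illustration only.
sample = VideoMetadata.new(
  "Sample title", "Sample description",
  "http://example.com/thumb.jpg", "http://example.com/player",
  120, Time.utc(2014, 1, 1)
)

video = to_sitemap_video(sample)
```

With such a helper, the map block in the sitemap definition could shrink to a single call per video, at the cost of one more indirection to read through.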

That’s it

These three snippets are the essence of it. There are of course tests, and there is an adapter for obtaining video data so that the tests don’t connect to the internet.
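
The adapter itself is not shown here, so the following is only a guess at its general shape: a small object with a single fetch(url) method, and a fake version backed by a hash for use in tests. FakeVideoAdapter, UnknownVideo, extract_title and the adapter: keyword are all hypothetical names.

```ruby
# Hypothetical error raised when a URL is not a known video.
class UnknownVideo < StandardError; end

# Test double: returns canned data instead of calling a video provider.
class FakeVideoAdapter
  def initialize(videos)
    @videos = videos
  end

  # Look the URL up in the canned data; unknown URLs raise UnknownVideo.
  def fetch(url)
    @videos.fetch(url) { raise UnknownVideo, url }
  end
end

# The extraction code depends only on something that responds to #fetch,
# so a real adapter and the fake one are interchangeable.
def extract_title(url, adapter:)
  adapter.fetch(url)[:title]
rescue UnknownVideo
  nil
end

adapter = FakeVideoAdapter.new(
  "https://www.youtube.com/embed/BBnN5VLuxKw" => { title: "Canned title" }
)

known   = extract_title("https://www.youtube.com/embed/BBnN5VLuxKw", adapter: adapter)
unknown = extract_title("https://example.com/not-a-video", adapter: adapter)
```

Passing the adapter in as an argument is one way to keep the network out of the test suite; stubbing the HTTP layer would be another.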

But it turned out to be way simpler than I expected, which is always a nice surprise in our industry.