I exported my bookmarks in HTML format from Firefox and wanted to do some processing on them, but parsers like the HTML Agility Pack could not handle the file properly. The format omits the end tags for DT (URL blocks) and DD (description blocks), so each entry ends up nested inside the previous one and XPath queries return massively nested node lists.

I think the proper solution is to deal with this using a regex, but it was faster for me to code it by hand, with the caveat that this depends on the formatting staying as I observed it in the current Firefox (24.0). Running it on other kinds of HTML or on other exported bookmark files will surely break.
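
For comparison, a regex-based pass might look something like the sketch below. This is not what the function in this post does: instead of repairing the missing end tags, it pulls the URL and title straight out of each line. It assumes one `<DT><A HREF="...">Title</A>` entry per line with `HREF` as the first attribute, which is what I saw in my export. The name `Get-BookmarkLinks` and the sample lines are made up for illustration.

```powershell
# Hypothetical helper: extracts URL and title from each bookmark line with a regex.
# Assumes one <DT><A HREF="..." ...>Title</A> entry per line.
function Get-BookmarkLinks {
    param([string[]] $Lines)

    # -match is case-insensitive by default, so <DT> and <dt> both match
    $pattern = '<DT><A\s+HREF="([^"]*)"[^>]*>(.*?)</A>'

    foreach ($line in $Lines) {
        if ($line -match $pattern) {
            # $Matches holds the capture groups of the last successful -match
            New-Object PSObject -Property @{
                Url   = $Matches[1]
                Title = $Matches[2]
            }
        }
    }
}

# Made-up sample lines in the shape of the export
$sample = @(
    '<DL><p>',
    '    <DT><A HREF="https://example.com/" ADD_DATE="1380000000">Example</A>',
    '    <DD>An example description',
    '</DL><p>'
)

Get-BookmarkLinks -Lines $sample
```

Titles come back exactly as written in the file, so HTML entities such as `&amp;` are not decoded, and descriptions are ignored entirely.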

Note: The HTML export does not include bookmark tags; only URL, Title, FavIcon, and Description. So don't rely on this approach if you use tags.

function Fix-FireFoxBookmarks {
    param(
        # Path to the bookmarks HTML file exported from Firefox
        [Parameter(Mandatory = $true)]
        [string] $bookmarksFileName
    )

    if (!(Test-Path $bookmarksFileName)) {
        Write-Warning "Bookmarks file not found: $bookmarksFileName"
        return
    }

    # Get-Content already returns one string per line; the export declares itself as UTF-8
    $bookmarksHtmlRaw = @(Get-Content $bookmarksFileName -Encoding UTF8)

    # Don't process it if it looks like we may have already processed it
    if (@($bookmarksHtmlRaw -like "*</dt>*").Count -eq 0) {
        $bookmarksHtml = @()
        $readyToEndDT = $false
        $readyToEndDD = $false

        for ($i = 0; $i -lt $bookmarksHtmlRaw.Count; $i++) {
            $line = $bookmarksHtmlRaw[$i]

            # Look ahead one line; treat the end of the file as an empty line
            $nextLine = ""
            if (($i + 1) -lt $bookmarksHtmlRaw.Count) {
                $nextLine = $bookmarksHtmlRaw[$i + 1]
            }

            if ($bookmarksHtmlRaw[$i] -like "*<dt>*") { # If we're starting a URL block
                $readyToEndDT = $true
            }
            if ($bookmarksHtmlRaw[$i] -like "*<dd>*") { # If we're starting a Description block
                $readyToEndDT = $false
                $readyToEndDD = $true
            }

            # If a URL block starts on the next line, or the enclosing folder list ends on the next line
            if ($readyToEndDD -and ($nextLine -like "*<dt>*" -or $nextLine -like "*</dl>*")) {
                # Close both the Description and URL blocks
                $line += "</dd></dt>"

                $readyToEndDD = $false
            }

            # If a URL block starts on this line, and a Description block does not start on the next line
            if ($readyToEndDT -and $bookmarksHtmlRaw[$i] -like "*<dt>*" -and $nextLine -notlike "*<dd>*") {
                # Close the URL block
                $line += "</dt>"

                $readyToEndDT = $false
            }

            $bookmarksHtml += $line
        }

        # Set-Content writes each array element as its own line
        Set-Content $bookmarksFileName $bookmarksHtml -Encoding UTF8
        Write-Host "Rewrote Bookmarks File"
    }
}
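
A quick sanity check after running the function is to compare the number of opening and closing DT/DD tags. The helper below is only a sketch: the name `Test-BookmarkTagsBalanced` and the sample strings are made up here, and it compares counts without validating that the tags are nested correctly.

```powershell
# Hypothetical helper: checks that <dt>/<dd> tags have as many closing as opening tags.
function Test-BookmarkTagsBalanced {
    param([string] $Html)

    foreach ($tag in 'dt', 'dd') {
        # [regex]::Matches is case-sensitive by default, so request IgnoreCase
        $open  = [regex]::Matches($Html, "<$tag>",  'IgnoreCase').Count
        $close = [regex]::Matches($Html, "</$tag>", 'IgnoreCase').Count
        if ($open -ne $close) { return $false }
    }
    return $true
}

# Made-up samples: one entry before and after the end tags are added
$unfixed = '<DT><A HREF="https://example.com/">Example</A>' + "`n" + '<DD>A description'
$fixed   = '<DT><A HREF="https://example.com/">Example</A>' + "`n" + '<DD>A description</dd></dt>'

Test-BookmarkTagsBalanced -Html $unfixed   # False
Test-BookmarkTagsBalanced -Html $fixed     # True
```

To check a real file, something like `Test-BookmarkTagsBalanced -Html (Get-Content $bookmarksFileName | Out-String)` should work.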