Download from other Wiki Page

Last modified by Vincent Massol on 2021/03/18 11:28

A script that creates an XWiki page from the supplied wiki source and attaches the pictures fetched from the original web page. Meant to be adjusted.
Type: Snippet
Category: Other
Developed by: Paul Libbrecht
License: Apache License 2.0

Description

This tool converts a wiki page fetched from another wiki: it creates the corresponding XWiki document and attaches the pictures found in the original page. It has been used successfully to import dozens of pages from Confluence 3.0 into XWiki 7.1.

First, customize the source for your needs: the wiki syntax id, the specific mappings of page names and of source content, and possibly other adjustments such as the picture-detection regexp, special methods for Groovy to fetch the URL, or the temporary directory. Then run the tool for each page you want to import. Once adjusted, the tool can be used to import pages in bulk.
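The name and source mappings mentioned above can be tried outside the wiki before running an import. The following is a minimal sketch in plain Java, not the script itself: the class name `ImportCustomization` and the sample URL and markup are hypothetical, `computeDocname` takes a string and uses `java.net.URI` instead of the script's `URL` parameter, and only two of the script's source replacements are reproduced.

```java
import java.net.URI;

public class ImportCustomization {

    // Maps the source URL to an XWiki page name:
    // a fixed space prefix plus the last path segment of the URL.
    static String computeDocname(String url) {
        return "Census." + URI.create(url).getPath().replaceAll(".*/", "");
    }

    // Two of the case-specific replacements from the script:
    // drop the {section} and {column...} layout macros from the source.
    static String adjustSource(String source) {
        return source
                .replaceAll("\\{section\\}", "")
                .replaceAll("\\{column[^}]*\\}\n?", "");
    }

    public static void main(String[] args) {
        // Hypothetical page URL and source, for illustration only.
        System.out.println(computeDocname("http://wiki.example.org/display/CENSUS/Fractions"));
        System.out.println(adjustSource("{section}{column:width=50%}\nText{column}{section}"));
    }
}
```

With these sample inputs the two lines printed are `Census.Fractions` and `Text`. Checking the mappings this way on a few real URLs and source fragments is cheaper than discovering a wrong page name after a bulk import.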

The wiki-page that you save with the source below will display a form with:

  • document URL
  • source (in some wiki syntax)

When the form is submitted, the document is created and all images found are added as attachments. The resulting page shows the work done and links to the newly created wiki page so you can check it. Further manual adjustments (e.g. parents) can be made afterwards.
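How the images are found and named can be sketched separately from the wiki. This is a rough illustration in plain Java under stated assumptions: the class `ImageFinder`, its method names, and the sample HTML are hypothetical; the pattern is the one used in the script, and `java.net.URI.resolve` stands in for the script's `new URL(pageURL, ...)` call.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ImageFinder {

    // Same pattern as the script: pictures whose src starts with /download/attachments/.
    static final Pattern IMG =
            Pattern.compile("<img src=\"(/download/attachments/[^\"]*)");

    // Returns the absolute URL of every matching picture, resolved against the page URL.
    static List<URI> findImages(String pageUrl, String html) {
        URI base = URI.create(pageUrl);
        List<URI> found = new ArrayList<>();
        Matcher m = IMG.matcher(html);
        while (m.find()) {
            found.add(base.resolve(m.group(1)));
        }
        return found;
    }

    // The attachment name is the last path segment of the picture URL.
    static String attachmentName(URI image) {
        return image.getPath().replaceAll(".*/", "");
    }

    public static void main(String[] args) {
        // Hypothetical page fragment, for illustration only.
        String html = "<p><img src=\"/download/attachments/42/plot.png\" alt=\"\"/></p>";
        for (URI u : findImages("http://wiki.example.org/display/CENSUS/Fractions", html)) {
            System.out.println(u + " -> " + attachmentName(u));
        }
    }
}
```

For the sample fragment this prints `http://wiki.example.org/download/attachments/42/plot.png -> plot.png`. Images whose `src` does not begin with `/download/attachments/` are not matched, which is why the pattern is one of the things to adjust for a different source wiki.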

Prerequisites & Installation Instructions

As an admin, save a wiki page with the source below as its content (e.g. in Tools.DownloadFromOtherWikiPage).

// {{groovy}}
import com.xpn.xwiki.api.*;
import org.apache.commons.io.FileUtils;
import java.util.regex.*;

// Uncomment when editing in an IDE, so that it recognizes these variables and offers autocompletion
// Document doc = null;
// Context xcontext = null;

// case specific: search and replace for the wiki page content
String adjustSource(String source) {
   return source.replaceAll("\\{jsmath\\}(((?!jsmath).)*)\\{jsmath\\}", "{html}\\\\\\\\(\$1\\\\\\\\){html}").
            replaceAll("\\{sympresurl\\}(.*)\\{sympresurl\\}", "[See encodings|http://devdemo.activemath.org/mathbridge/tools/symbolpresentation.cmd?\$1]").
            replaceAll("\\{section\\}","").
            replaceAll("\\|bibliography", "|Bibliography").
            replaceAll("\\{column[^}]*\\}\n?","");
}

// case specific: mapping between URL and XWiki page name
String computeDocname(URL url) {
   return "Census." + url.getPath().replaceAll(".*/","");
}

// case specific: syntax id
String syntaxId = "confluence/1.0";


if(request.url) {
    URL pageURL = new URL(request.url);
    name = computeDocname(pageURL);
    doc = xwiki.getDocument(name);
    println("Saving to document [[${doc}]].")
    print("{{html clean=false wiki=false}}")
    println("Filename ${name}.html");
    File t = new File("/tmp/${name}.html");
    FileUtils.copyURLToFile(pageURL, t);
    String html = FileUtils.readFileToString(t, "utf-8");
    Pattern pattern = Pattern.compile("<img src=\\\"(/download/attachments/[^\\\"]*)");
    Matcher m = pattern.matcher(html);
    println("<ul>")
    while(m.find()) {
        URL u = new URL(pageURL, m.group(1));
        String imageName = u.getPath().replaceAll(".*/","");
        println("<li><a href='${u}'>${imageName}</a>");
        File tmp = new File("/tmp/xxx.png");
        FileUtils.copyURLToFile(u, tmp);

        // now fetch and store
        byte[] content = FileUtils.readFileToByteArray(tmp);
        //println("Attachment is of size ${content.length} and is named ${name}");
        Attachment attachment = doc.addAttachment(imageName, content);
        println(" has become <a href='${doc.getAttachmentURL(imageName)}'>${imageName}</a> (rev ${attachment.getVersion()}).</li>");
    }
    println("</ul>");
    String source = request.source;
    source = adjustSource(source);
    doc.setContent(source);
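    // assumption: use the last path segment of the URL as the title; adjust as needed
    doc.setTitle(pageURL.getPath().replaceAll(".*/",""));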
    doc.setSyntaxId(syntaxId);
    doc.save("Fetching from ${pageURL}.");
    print("{{/html}}");
} else {
    println("Please provide a url and a source parameter.");
    println("{{html clean=false wiki=false}}");
    println("<form action='${doc.name}' method='POST'>")
    println("url: <input name='url' type='text'/><br/>")
    println("source: <textarea name='source'></textarea>")
    println("<input name='submit' type='submit'/></form>")
    println("{{/html}}");
}


//  {{/groovy}}
     
