I am just starting out on my first PostNuke site.
The site will cover a technical and commercial subject, so the bulk of the content will be university papers and company whitepapers. I receive these papers in PDF format.
Assuming that PDF is not a good choice for PostNuke, my only option is to convert the PDFs to HTML.
"Advanced PDF to HTML Converter" does batch jobs and produces an HTML file that looks the same as the PDF, which is somewhat impressive. However, it produces large files with a lot of HTML markup per piece of real text (content). I fear that Google may not like pages that carry so much markup relative to the actual text.
My other thought is that I could create some kind of template and manually copy the text from each PDF into the pre-made HTML template. This takes time, but it would give a uniform look and, I imagine, be more search engine friendly.
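To put a rough number on the difference between the two approaches, here is a minimal Python sketch of how the share of visible text in an HTML page could be measured. This is only my own illustration: the function name `text_to_html_ratio` and the sample markup are made up, it is not taken from any converter's real output, and I have no evidence that search engines use this particular ratio.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def text_to_html_ratio(html):
    """Return non-whitespace text characters divided by total HTML length.

    Hypothetical helper for illustration only.
    """
    if not html:
        return 0.0
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    visible = len("".join(text.split()))  # drop all whitespace before counting
    return visible / len(html)


# Made-up samples: a lean hand-written template versus converter-style markup.
lean = "<p>Hello world</p>"
bloated = (
    '<span style="position:absolute;left:10px">'
    '<font size="2">Hello world</font></span>'
)

print(round(text_to_html_ratio(lean), 2))
print(round(text_to_html_ratio(bloated), 2))
```

Running this over a converted file and over a hand-filled template page would show how far apart the two really are, which may or may not matter to Google.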
So, first question: what do others think of my ideas for handling a site where the content is given to me in PDF format?
Second question: I hear of third-party tools for dealing with content formatting, notably PagEd and ContentExpress. Can others out there let me know how these programs actually benefit you and which PostNuke modules they replace, if any? I have looked at both the PagEd and ContentExpress sites, but their benefits remain a mystery to me. Thanks.
