Friday, October 12, 2012

Fixing another /var/lib/dpkg/status corruption

This has happened a few times on my Debian/Ubuntu (apt-based) system, sending me searching for answers each time. I always end up hand-editing the file to make it work, which makes me a bit uneasy, but in many of these situations it really does appear to be the only choice. This "database" is just a plain text file.

Sometimes the problem is obvious, but often it is not. For example, here is my latest error message:
dpkg-query: error: parsing file '/var/lib/dpkg/status' near line 7848 package 'fontconfig': field name `
There is no ` character anywhere near line 7848, field name or otherwise.
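Since the reported line number is only "near" the problem, it can help to look at the whole package section around it. Here is a minimal Python sketch for that, assuming sections are separated by blank lines (as they are in a healthy status file). The function name stanza_at_line is my own invention, not part of dpkg or apt.

```python
def stanza_at_line(status_text: str, line_number: int) -> str:
    """Return the blank-line-delimited section containing the 1-based line_number.

    A blank line is treated as belonging to the section it terminates.
    """
    lines = status_text.split("\n")
    if not 1 <= line_number <= len(lines):
        raise IndexError(f"line {line_number} is outside the file")

    idx = line_number - 1
    # If we landed on a blank separator, step back into the section above it.
    while idx > 0 and not lines[idx].strip():
        idx -= 1

    # Walk outwards until we hit a blank line (or the file boundary).
    start = idx
    while start > 0 and lines[start - 1].strip():
        start -= 1
    end = idx
    while end < len(lines) - 1 and lines[end + 1].strip():
        end += 1

    return "\n".join(lines[start:end + 1])
```

For the error above, you would read /var/lib/dpkg/status into a string and print stanza_at_line(text, 7848) to see the section dpkg is complaining about.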

The easiest solution to this, and to similar issues, is to remove the whole section belonging to the named package. So in this case I remove everything starting at the line containing
Package: fontconfig
down to, and including, the next blank line.

But now apt/dpkg no longer knows the status of your fontconfig package. The easiest way to solve that is to just reinstall it (sudo apt-get install fontconfig).

Make a backup of the status file first, if you're nervous.
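If you would rather script the edit than do it by hand, the removal described above can be sketched in Python. This is only a sketch under a few assumptions: sections are separated by truly empty lines, "Package:" is the first field of each section, and the corruption has not mangled the Package line itself. The function name remove_package_stanza is hypothetical.

```python
def remove_package_stanza(status_text: str, package: str) -> str:
    """Return status_text without any section whose first line is 'Package: <package>'.

    Note: if the same package name appears more than once (for example once
    per architecture), every matching section is dropped.
    """
    kept = []
    for stanza in status_text.split("\n\n"):
        body = stanza.strip("\n")
        if not body.strip():
            continue  # skip empty chunks, e.g. after the final blank line
        first_line = body.splitlines()[0].strip()
        if first_line == f"Package: {package}":
            continue  # this is the section we want gone
        kept.append(body)
    # Each section, including the last, is followed by one blank line.
    return "\n\n".join(kept) + "\n\n" if kept else ""
```

To use it, read the file as root with open("/var/lib/dpkg/status", encoding="utf-8"), save an untouched copy somewhere safe, write the returned text back, and then reinstall the package as above.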

Friday, July 13, 2012

Inside a Sony PRS-T1 Reader

Having been handed a broken PRS-T1, I decided to rip it apart. Below is what it looks like, for the curious. Note the battery (which has been disconnected in the pictures) is soldered to the board, so it is not very easily replaceable/reusable.

Here is the PRS-T1 service manual with instructions on how to open the thing up. Basically it has (pretty tight) clips along the inside sides.

(Right click and save, or open in a new tab/window, if you want full resolution to zoom in on.)

Sunday, January 29, 2012

Join / Merge / Combine / Concatenate PDF pages into a single PDF

There are many free (and open source) methods available.

pdfjoin / pdfjam - command line utility. Very easy to use, but has some limitations:
  • Output pages are all the same size
  • Hyperlinks are stripped
Use ghostscript to "print" all of the files into a new PDF file. With this method, you can use other ghostscript capabilities to further modify the output (such as resampling graphics, etc.):
  •  gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=output.pdf input1.pdf input2.pdf
pdftk (PDF Toolkit) - yet more command line tools for PDF manipulation. It supports a bunch of different command line styles (examples can be seen on its web site). The simplest is:
  • pdftk input1.pdf input2.pdf input3.pdf cat output output.pdf
  • A powerful but complicated GUI for pdftk is also available
pyPDF - a tool for the programming / Python oriented. It can split, merge, watermark, rotate, and extract information with simple Python scripting. There is a GTK-based GUI available called PDF-Shuffler.

PDF Split and Merge is a cross-platform, Java-based GUI (and command line) tool.

There are more! With so many choices, how does one choose? Whim, personal preference, scale requirements, the tools and platform on hand (ease of installation/integration), or just pick one at random.
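If you end up merging PDFs from a script, the ghostscript and pdftk command lines above are easy to wrap. Below is a rough Python sketch that builds exactly those two command lines and runs one of them. The function names (gs_merge_command, pdftk_merge_command, merge_pdfs) are my own, and it assumes the chosen tool is installed and on your PATH.

```python
import shutil
import subprocess
from typing import List, Sequence


def gs_merge_command(inputs: Sequence[str], output: str) -> List[str]:
    """Build the ghostscript argv that "prints" all inputs into one PDF."""
    if not inputs:
        raise ValueError("need at least one input PDF")
    return [
        "gs", "-dBATCH", "-dNOPAUSE", "-q",
        "-sDEVICE=pdfwrite",
        f"-sOutputFile={output}",
        *inputs,
    ]


def pdftk_merge_command(inputs: Sequence[str], output: str) -> List[str]:
    """Build the simplest pdftk argv: concatenate inputs in the order given."""
    if not inputs:
        raise ValueError("need at least one input PDF")
    return ["pdftk", *inputs, "cat", "output", output]


def merge_pdfs(inputs: Sequence[str], output: str, tool: str = "gs") -> None:
    """Run the merge with the chosen tool ("gs" or "pdftk")."""
    builders = {"gs": gs_merge_command, "pdftk": pdftk_merge_command}
    argv = builders[tool](inputs, output)
    if shutil.which(argv[0]) is None:
        raise FileNotFoundError(f"{argv[0]} is not installed or not on PATH")
    # Passing a list (no shell) means spaces in filenames need no quoting.
    subprocess.run(argv, check=True)
```

Something like merge_pdfs(["input1.pdf", "input2.pdf"], "output.pdf") should then do the same job as the ghostscript one-liner.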

Tuesday, January 17, 2012

Resampling jpegs inside a PDF (to shrink/reduce file size)

My wife is applying for various teaching jobs. Most of the positions require uploading documents to various school board applicant tracking systems. I carefully prepared the documents for her, also wanting to keep them for archival purposes. The uploads produced a bizarre error. After contacting the site administrator, we found out that there is a 1 megabyte file size limit on uploads, and this is what produces the cryptic error. Sigh. (To add insult to injury, after resampling the scanned PDFs, it turned out there are also crazy filename limitations... so all the nice, descriptively named files had to be renamed to eliminate various characters and fit a certain length.) These are the gatekeepers of our children's education.

But I digress.

This info is in a lot of places on the net already, but I want to add it here for my own reference, as I've forgotten the simple command line a few times already and had to search for it again.

In short, use ghostscript:
gs -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
The PDFSETTINGS presets equate to the following resolutions:

  • /screen ... screen-view-only quality, 72 dpi images 
  • /ebook ... medium quality, 150 dpi images
  • /printer ... high quality, 300 dpi images 
  • /prepress ... high quality, color preserving, 300 dpi images
  • /default ... general-purpose output, useful across a wide variety of uses, possibly at the expense of a larger file

Other command line arguments suggested in various places:

  • -dCompatibilityLevel=1.4 
  • -dColorImageResolution=38 -dColorImageDownsampleType=/Average -dGrayImageDownsampleType=/Average -dGrayImageResolution=38 -dMonoImageResolution=38 -dMonoImageDownsampleType=/Average -dOptimize=true -dDownsampleColorImages=true -dDownsampleGrayImages=true -dDownsampleMonoImages=true -dUseCIEColor -dColorConversionStrategy=/sRGB
  • -dMaxSubsetPct=100
This page is quite good for further tips (such as recompressing with lossless compression).
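For an upload limit like the one above, trying the presets in turn can be automated. Here is a hedged Python sketch that wraps the ghostscript command line from this post and steps down from the highest quality preset until the output fits. The names gs_resample_command and shrink_to_limit are made up for this example, and the 1,000,000 byte default is my guess at what "1 megabyte" means to the upload site.

```python
import os
import subprocess
from typing import List, Optional

# Ordered from highest to lowest image quality.
PRESETS = ("/prepress", "/printer", "/ebook", "/screen")


def gs_resample_command(src: str, dst: str, preset: str = "/ebook") -> List[str]:
    """Build the ghostscript argv used above, with a chosen PDFSETTINGS preset."""
    if preset not in PRESETS + ("/default",):
        raise ValueError(f"unknown PDFSETTINGS preset: {preset}")
    return [
        "gs", "-sDEVICE=pdfwrite", f"-dPDFSETTINGS={preset}",
        "-dNOPAUSE", "-dQUIET", "-dBATCH",
        f"-sOutputFile={dst}", src,
    ]


def shrink_to_limit(src: str, dst: str, limit_bytes: int = 1_000_000) -> Optional[str]:
    """Re-run ghostscript with lower and lower quality until dst fits.

    Returns the preset that worked, or None if even /screen is too big
    (dst is then left holding the /screen attempt).
    """
    for preset in PRESETS:
        subprocess.run(gs_resample_command(src, dst, preset), check=True)
        if os.path.getsize(dst) <= limit_bytes:
            return preset
    return None
```

Calling shrink_to_limit("input.pdf", "output.pdf") keeps the best quality that still squeezes under the limit, which saves re-typing the command for each preset.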