Converting HTML files into PDFs using Python

Your ads will be inserted here by

Easy Plugin for AdSense.

Please go to the plugin admin page to
Paste your ad code OR
Suppress this ad slot.

Once I was asked to read Swift docs and give a demo on the same. As you can see this docs is only HTML and no PDF version is available. It's itching to read HTMLs for a long time, and I personally prefer PDFs. Wanted to fetch all the doc pages and convert it to PDF. Here's how we can easily (read 'naive') do it using pdfkit.

P.S: swift_links.txt is my local file which has all the relevant links from the docs URL. How to extract links from a webpage? Different story ~ read here.

P.P.S: Have never checked about the copyright issue. Never transport the PDF unless you're quite sure it doesn't violate their policies.

Want to merge these files? Check it out here.

~Arunanand

Copying a List or Dict in Python: Is it as Simple as it looks?

Your ads will be inserted here by

Easy Plugin for AdSense.

Please go to the plugin admin page to
Paste your ad code OR
Suppress this ad slot.

Howdy!

How do we copy a list in python to another list? This might be quite deceiving for novices who have not done our basics right.

Let's have a look at the following sample code:

The output is:

But does it really work? Have all the elements 'really' copied to a new list? I mean, are they having some 'internal bond' with their respective elements in the parent list? Let's try printing the following:

Alas, it prints 'logging' though we changed it for the list li. That means, changes made in li also reflects in li_copy as well. This is how naive copying of lists and dict works in Python. We call it 'Shallow Copying". When we simply assign a list to another, or dict to another for that matter, Python just bonds the two with a reference. Changes made in one of them will affect the other one too!

How do we go about this problem? How to do 'really' copy, which we call as 'deep copy' without any internal references? Let's have a look.

Difference in References in Copy Schemes: Representational Figure TAAism Copy Python
Difference in References in Copy Schemes: Representational Figure

 

Your ads will be inserted here by

Easy Plugin for AdSense.

Please go to the plugin admin page to
Paste your ad code OR
Suppress this ad slot.

One logic is to copy each of the elements over a loop, to another list. When we do so, as given above, there would be no references made between the two lists. Changing something in one list would, obviously, not make changes in the other. The output of the above code is:


So we're done with both shallow copy and deep copy, writing our own codes. Is there a package in python which helps do this easily? Yes, we have a module named 'copy', importing which will help you do this in a better way. Have a look at the deep copy:

The output of the above code is:

The python docs has more of it in detail.

And now, what's the difference between copy and deepcopy methods of the copy module? Have a look here and find out yourself.

Well, these logics work for dict also. Try it out yourself!

By the way, did you look at the outputs of the three programs we wrote? Look at the time of execution and analyse it!

Do comment in, should you need to point out errata, or suggestions.

~Arunanand T A