Python File Management - Chapter 3

File Management

After understanding the basics of Python, it’s time to learn how to work with files. When you write code, you are creating something new, and you want it stored in a way that lets you get back to it later. The same is true of the data your programs produce: it often needs to be saved to a file so that it can be displayed or reused when the time comes. This chapter looks at the different kinds of files you can work with in Python.

When you create a new program file, you must always save it, much as you save a document in Word or Excel. If you don’t save it, all your hard work will be lost. To write a new file in IDLE, open a new window by going to File>New File. To save the file, go to File>Save As. Save your Python files with the .py extension, just as you would save a Word file as .doc.

Some of the things that we are going to explore in this chapter include:

  • Creating a new file
  • Moving a file
  • Adding more text to an existing file
  • Closing a file

Working with Files in Python

There is quite a bit to learn about working with files in Python but, as a beginner, this is absolutely the
first place to start. If you don’t know how to create a file, write to a file, open, save and move files,
then your Python experience is not going to be an easy one. So, let’s start with creating and writing
text to files and the most basic program that all beginners start with – Hello, World!
To create a new file, you first need to input some text, so try this:
print('hello world')
Python is an object-oriented language, which means that the code is constructed around objects. These
objects contain data and several methods that are needed to access the data and alter it. Once an
object has been created, it will then be able to interact with any of the other objects in the code.
In the basic example above, we see just one object type, a string that says, ‘hello world’. This string
is nothing more than a character sequence that is encased in quote marks.
Strings can be written in three ways:
message1 = 'hello world'
message2 = "hello world"
message3 = """hello
hello
hello world"""
As I mentioned to you earlier, a string must be closed with the same kind of quote mark that opened
it. For example, these are all wrong:
message1 = "hello world'
message2 = 'hello world"
message3 = 'I don't like pickles'
Look at how many single quotes are in message3. For that to work, the apostrophe in the word
“don’t” must be escaped with a backslash so it isn’t treated as the closing quote:
message3 = 'I don\'t like pickles'
Or you can rewrite it using double quotes to enclose the string:
message3 = "I don't like pickles"
The third way of writing a string shown earlier uses triple quotes (""") at the start and the end.
These indicate that a string runs over more than a single line.
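The quoting rules above can be checked directly. This is a minimal sketch; the variable names are only illustrative and do not come from the chapter.

```python
# Three ways to write a string literal
single = 'hello world'
double = "hello world"
multi = """hello
hello
hello world"""

# An apostrophe inside single quotes must be escaped with a backslash...
escaped = 'I don\'t like pickles'
# ...or the whole string can be enclosed in double quotes instead
unescaped = "I don't like pickles"

print(single == double)      # the choice of quote mark does not change the value
print(escaped == unescaped)  # both spellings produce the same string
print(multi.count('\n'))     # number of line breaks stored in the triple-quoted string
```

The first two comparisons should print True, because the quote marks are part of the syntax and not part of the stored text.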
The print function displays an object in textual form and, when you combine it with a string, you get
a complete statement. We use print in this way when we want to produce information that will be read
straight away. Sometimes, though, you will want to create information that is to be saved, sent to
another place or used by another program for more processing. In situations like this, the information
needs to be sent to a file on your hard drive instead of to the “command output” pane. Input this
program in the text editor and then save it with the name file-output.py.
# file-output.py
f = open('helloworld.txt','w')
f.write('hello world')
f.close()
Where you see a hash (#) at the start of a line, this is a comment and the interpreter will ignore it.
Recall, comments are used to make notes for yourself or for others who might be reading your code.
In this particular program, f is the file object. It is created by the built-in open function, and write
and close are its methods. A method is a piece of code attached to an object that performs a specific
action on that object; in this case, the object is a .txt file.
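f is the name we have given to a variable. You can name it whatever you want, provided you follow the
naming rules: letters, digits and underscores are allowed, but the name cannot start with a digit,
cannot contain other special characters and definitely cannot be a reserved keyword. If you attempted
to name your variable “class”, for example, your program would fail because class is a reserved
keyword. (In Python 2, print was a reserved keyword as well. In Python 3 it is an ordinary built-in
function, so using it as a variable name would not fail outright but would stop print from working.)
Don’t forget, variable names are also case sensitive and that means the following three would all be
different variables: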
  • FOOBAR
  • Foobar
  • foobar
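The naming rules can be tested from within Python itself. The sketch below assumes Python 3, where print is a built-in function and not a keyword; it uses the standard keyword module and the str.isidentifier method.

```python
import keyword

# Names that differ only in case are three separate variables
FOOBAR = 1
Foobar = 2
foobar = 3
print(FOOBAR, Foobar, foobar)

# keyword.iskeyword reports whether a word is reserved by the language
print(keyword.iskeyword('class'))   # a reserved keyword
print(keyword.iskeyword('print'))   # not reserved in Python 3

# str.isidentifier reports whether a string follows the basic naming rules
print('file_1'.isidentifier())      # letters, digits and underscores are fine
print('1file'.isidentifier())       # a name cannot start with a digit
```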

Back to the file; when you run this, the open function will inform your computer that a new file
needs to be created. The file is to be called helloworld.txt, it will be a text file and it will be saved in
the current working folder, which is normally the folder where you saved file-output.py. We use the parameter “w” to indicate that we are going to use Python to write new content to the file.
NOTE – because the parameter and the file name are both enclosed in single quotes, they have both
been stored as strings. If you omit these quote marks, your program will fail.
Your program writes a message to the file on the next line, a string of characters that reads “hello
world” and then closes that file.
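The same open, write and close sequence is often written with a with statement, which closes the file for you even if an error occurs part-way through. This is a sketch and not the chapter's own program. It writes into a temporary scratch folder (an assumption made here so the example does not touch your real files).

```python
import os
import tempfile

# Hypothetical scratch folder, used only for this illustration
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'helloworld.txt')

# The with statement closes the file automatically when the block ends
with open(path, 'w') as f:
    f.write('hello world')

print(f.closed)               # the file object has been closed for us
print(os.path.exists(path))   # the new file now exists on disk
```

Because the file is closed automatically, there is no f.close() line to forget.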
In your editor, execute the program. You won’t see anything in the Command Output pane but you will
see a message that will read something along these lines if you use Mac or Linux:
`/usr/bin/python file-output.py` returned 0.
Or like this in Windows:
'C:\Python27\Python.exe file-output.py' returned 0.
What this message is telling you is that your program was successful in executing. Open the file by
selecting File>Open>File and selecting helloworld.txt; you should see the message:
hello world
Text files do not contain a whole lot of formatting information and, as such, they are small and can
easily be exchanged between platforms – Windows to Mac, or to Linux, or the other way around –
and they can be read by those using different text editors to the one the file was written in.
Reading from Text Files
Python also contains a number of methods that allow you to retrieve information from a file. Input the
following program into your editor and save it with the name file-input.py. When you execute it by
clicking on Run, the text file you just created will open, the message will be read from it and the
message will be printed to the command output pane:
# file-input.py
f = open('helloworld.txt','r')
message = f.read()
print(message)
f.close()
Here, we have used the parameter “r” to indicate that we want to open and read from a file.
Parameters allow you to choose from all the different options offered by the method. Let’s say that
you train your dog to bark for a treat – once for beef, twice for chicken. The flavor of that snack is the
parameter in our code. Each method differs in what parameters will be accepted.
Read is a file method. The file contents, in this case, one line of text, will be copied to “message”, the
name we have given to the string, and print will send this to the command output pane.
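Besides read, a file object can also be looped over one line at a time, which is handy for longer files. The sketch below first creates a small two-line file in a temporary folder (an assumption for illustration), then reads it back both ways.

```python
import os
import tempfile

# Hypothetical scratch file with two lines of text
path = os.path.join(tempfile.mkdtemp(), 'helloworld.txt')
with open(path, 'w') as f:
    f.write('hello world\nhello again\n')

# read() returns the whole file as a single string
with open(path, 'r') as f:
    whole = f.read()

# Looping over the file object yields one line at a time
with open(path, 'r') as f:
    lines = [line.rstrip('\n') for line in f]

print(repr(whole))
print(lines)
```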

Appending to a Text File That Already Exists

You can also open a file that has already been created and add some more text to it. Be aware: if you
open any existing file using “w”, the write mode, the contents of the file will be overwritten, so take
care what you are doing. This isn’t a problem when you create new files, or when you want the
contents overwritten, but it can cause huge problems when you want to compile large data sets into
one single file or create event logs. In those cases we use the append mode, “a”, instead.
Input this program into the editor and then save it with the name file-append.py. Now, when this
program is run, the helloworld.txt file you created will open and a second “hello world” will be
appended to the file. ‘\n’ indicates a new line:
# file-append.py
f = open('helloworld.txt','a')
f.write('\n' + 'hello world')
f.close()
When this has been run, open the text file called helloworld.txt. What do you see? Close it, run
file-append.py another two or three times and then open the helloworld.txt file again. You should see
the original hello world message followed by one more copy for each time you ran file-append.py.
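The difference between the “w” and “a” modes can be seen in one short run. This sketch uses a temporary scratch file (an assumption for illustration) in place of your real helloworld.txt.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'helloworld.txt')

# Create the file with a single line
with open(path, 'w') as f:
    f.write('hello world')

# Append to it three times, as if file-append.py had been run three times
for _ in range(3):
    with open(path, 'a') as f:
        f.write('\n' + 'hello world')

with open(path, 'r') as f:
    lines = f.read().split('\n')
print(len(lines))        # the original line plus three appended lines

# Opening the same file with 'w' again discards everything written so far
with open(path, 'w') as f:
    f.write('hello world')
with open(path, 'r') as f:
    after_overwrite = f.read()
print(after_overwrite)
```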
Moving Files
In Python, moving a file is actually renaming it and it is incredibly easy. This is down to a useful
module named shutil, a module that contains a function called move. That function does exactly what
it says on the tin – it will move a file or a directory from one place to another. Have a look at this
simple example:
import shutil
def move(src, dest):
    # Move the file or directory at src to dest
    shutil.move(src, dest)
See? Dead simple. The move function will take the source directory or file and move it to the new
directory or file.
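Here is a small runnable sketch of such a move. The folder names inbox and archive are hypothetical and are created inside a temporary folder so that nothing on your system is disturbed.

```python
import os
import shutil
import tempfile

# Hypothetical folders created only for this illustration
base = tempfile.mkdtemp()
src_dir = os.path.join(base, 'inbox')
dest_dir = os.path.join(base, 'archive')
os.mkdir(src_dir)
os.mkdir(dest_dir)

src = os.path.join(src_dir, 'helloworld.txt')
with open(src, 'w') as f:
    f.write('hello world')

# When the destination is an existing directory, the file is moved inside it
new_path = shutil.move(src, dest_dir)

print(os.path.exists(src))          # the file is gone from its old location
print(os.path.basename(new_path))   # the file keeps its name in the new folder
```

In Python 3, shutil.move returns the path of the moved item, which is why the result can be stored in new_path.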
shutil.move vs os.rename
If the destination is on the same file system as the source, shutil.move will use os.rename to
move it. Otherwise, shutil.copy2 is used to copy the file to the new location (a directory is copied
as a whole tree) and the source is then deleted.
So, why do we use shutil.move instead of using os.rename? The answer to that is above: shutil.move
looks after the cases where the destination is on a different file system. If os.rename fails for that
reason, shutil.move falls back to copying and deleting, so you don’t need to handle that case
yourself.
shutil.move can also raise its own exception, shutil.Error. This happens, for example, when you move
something into a directory that already contains an entry with the same name, or when you try to
move a directory into itself.
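One case that raises shutil.Error can be reproduced safely in a temporary folder. In this sketch, the archive folder and file names are hypothetical. A file is moved into a directory that already holds a file of the same name.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
dest_dir = os.path.join(base, 'archive')
os.mkdir(dest_dir)

# Create the source file and a file with the same name inside the destination
src = os.path.join(base, 'helloworld.txt')
clash = os.path.join(dest_dir, 'helloworld.txt')
for p in (src, clash):
    with open(p, 'w') as f:
        f.write('hello world')

try:
    shutil.move(src, dest_dir)
    moved = True
except shutil.Error as err:
    # The move is refused and the source file is left where it was
    moved = False
    print('Move refused:', err)

print(moved)
print(os.path.exists(src))
```

Catching the exception lets your program decide what to do next, such as choosing a different file name.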
That is how simple it is to move a file. The only thing you need to note here is that, if the destination
is on the same file system, the call to the move function will be almost instant, whereas if you are
moving the file to another file system or drive, it will take longer because the data has to be copied.

Working with Binary Files

What you know as a file is not quite what a file is in Python. In everyday computer use, a file is any
item that is created, edited or manipulated by the user, such as images, executables, text documents,
and so on. These files tend to be organized into folders so they are easy to find again.
In Python, files come under just two categories, text and binary, and the difference between them is
very important.
Text files are sequences of lines, and each line contains a sequence of characters. Each line ends
with an EOL (End of Line) character. There are several of these, but the ones used most often are the
newline character (\n) and, on Windows, a carriage return followed by a newline (\r\n). They mark
where one line ends and the next begins.
A binary file, on the other hand, is a file that isn’t a text file. Binary files can’t be processed by just
any application; it must be an application that knows the structure of the file and, more importantly,
understands it. In layman’s terms, binary files can only be processed by those applications that know
how to read and to interpret binary.
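In code, the difference shows up in the mode you pass to open and in the type of data you get back. The sketch below assumes Python 3 and a temporary scratch file. Text mode returns a str, while binary mode returns the raw bytes.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'helloworld.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('hello world')

# Text mode ('r') decodes the bytes on disk into a string
with open(path, 'r', encoding='utf-8') as f:
    as_text = f.read()

# Binary mode ('rb') returns the bytes exactly as they are stored
with open(path, 'rb') as f:
    as_bytes = f.read()

print(type(as_text).__name__)
print(type(as_bytes).__name__)
print(as_bytes.decode('utf-8') == as_text)   # decoding the bytes gives the same text
```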

Writing to a Binary File

It is very simple to write to a binary file in Python. One way is to open the file in binary mode and
then write the data into the file as bytes, written here using hexadecimal escape sequences.
Have a look at this example:
output_file = open("myfile.bin","wb")
output_file.write(b"\x0a\x1b\x2c")
output_file.write(b"\x3d\x4e\x5f")
output_file.close()
To read the contents of the binary file generated, in Linux we use the hexdump command:
hexdump -C myfile.bin
The -C option tells hexdump to show the file contents both in hexadecimal form and as an ASCII
string.
The output of that command would be:
00000000 0a 1b 2c 3d 4e 5f |..,=N_|
00000006
The dot that you see on the right-hand side is representative of a byte that hexdump cannot interpret as
an ASCII character or a byte that contains the ASCII code for the dot character.
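If hexdump is not available, the same bytes can be inspected from Python itself. This is a sketch assuming Python 3. It rewrites the six bytes to a temporary scratch file and reads them back in binary mode.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'myfile.bin')

with open(path, 'wb') as output_file:
    output_file.write(b"\x0a\x1b\x2c")
    output_file.write(b"\x3d\x4e\x5f")

# Reading in 'rb' mode returns a bytes object
with open(path, 'rb') as input_file:
    data = input_file.read()

print(len(data))                                        # number of bytes in the file
print(' '.join(format(byte, '02x') for byte in data))   # each byte as two hex digits
```

The second line of output should list the same six hexadecimal values that hexdump shows.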
There is one problem with this; this is not an easy way to go when we want several objects written
into a binary file. For example, what if we wanted to add strings, integer values or a list into the file?
How could we possibly read these values when we wanted to? The long way around would be to
write metadata that specified the structure of each addition and this is incredibly advanced stuff – not
something you need to be worrying about at this stage. Luckily, there is an effortless way because
Python has one module that will do all this work for us. It’s a module called pickle and it lets us
convert objects to byte streams, which we can then store in files and use to reconstruct the original
object later. Pickle cannot do this for all data types but it can do it for most of what you will use in Python.
I expect this all sounds a little on the complicated side so let’s look at a few examples which will
show you just how easy pickle makes things.

To write objects straight to a binary file (pickle calls this a dump), input this program:

import pickle
output_file = open("myfile.bin", "wb")
myint = 42
mystring = "Hello, world!"
mylist = ["spoon", "fork", "knife"]
mydict = { "name": "Simon", "job": "Doctor" }
pickle.dump(myint, output_file)
pickle.dump(mystring, output_file)
pickle.dump(mylist, output_file)
pickle.dump(mydict, output_file)
output_file.close()
What you see generated in the binary file will be different depending on whether you are using Python
2 or Python 3 and the reason for that is because the way pickle works has changed as time has gone
by. The recommendation is to use Python 3 when you use pickle so there are no compatibility issues.
We can load or retrieve the original objects from the file called myfile.bin, in the exact same order
they were dumped into it:
import pickle
input_file = open("myfile.bin", "rb")
myint = pickle.load(input_file)
mystring = pickle.load(input_file)
mylist = pickle.load(input_file)
mydict = pickle.load(input_file)
print("myint = %s" % myint)
print("mystring = %s" % mystring)
print("mylist = %s" % mylist)
print("mydict = %s" % mydict)
input_file.close()
The program output shows you that the original objects have been retrieved correctly from within the
binary file:
myint = 42
mystring = Hello, world!
mylist = ['spoon', 'fork', 'knife']
mydict = {'name': 'Simon', 'job': 'Doctor'}
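An alternative to dumping several objects one after another is to gather them into a single dictionary and dump that once, so the loading order no longer matters. This is a sketch and not the chapter's method. The saved and restored names are hypothetical, and a temporary scratch file is used. Keep in mind that pickle.load should only be used on files you trust.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'myfile.bin')

# Gather everything to be stored into one dictionary
saved = {
    "myint": 42,
    "mystring": "Hello, world!",
    "mylist": ["spoon", "fork", "knife"],
    "mydict": {"name": "Simon", "job": "Doctor"},
}

# One dump writes the whole dictionary
with open(path, "wb") as output_file:
    pickle.dump(saved, output_file)

# One load brings it all back
with open(path, "rb") as input_file:
    restored = pickle.load(input_file)

print(restored == saved)    # the contents are equal
print(restored is saved)    # but it is a newly built object, not the same one
```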

That completes this overview of working with text and binary files.
