
Reading and Writing Opened Files


Once you've opened a file, you'll want to read from it or write to it. First off, let's cover reading a file. There are multiple methods that can be called on a file object to help you out:
Method | What It Does |
---|---|
.read(size=-1) | This reads at most size characters from the file (size bytes in binary mode). If no argument is passed, or None or -1 is passed, then the entire file is read. |
.readline(size=-1) | This reads at most size characters from the current line. A later call continues where the previous one stopped and moves on to the next line once the end of the line is reached. If no argument is passed, or None or -1 is passed, then the entire line (or rest of the line) is read. |
.readlines() | This reads the remaining lines from the file object and returns them as a list. |
Using the same dog_breeds.txt file you used above, let's go through some examples of how to use these methods. Here's an example of how to open and read the entire file using .read():
with open('dog_breeds.txt', 'r') as reader:
    # Read & print the entire file
    print(reader.read())
# Output:
# Pug
# Jack Russell Terrier
# English Springer Spaniel
# German Shepherd
# Staffordshire Bull Terrier
# Cavalier King Charles Spaniel
# Golden Retriever
# West Highland White Terrier
# Boxer
# Border Terrier
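When a file is too large to read in one call, the size argument lets you read it piece by piece. The following is a minimal sketch rather than part of the original example: the helper name read_in_chunks, the chunk size of 8 characters, and the temporary sample file are all assumptions made for illustration.

```python
import os
import tempfile


def read_in_chunks(path, chunk_size=8):
    """Yield pieces of at most chunk_size characters (hypothetical helper)."""
    with open(path, 'r') as reader:
        while True:
            chunk = reader.read(chunk_size)
            if chunk == '':  # .read() returns an empty string at end of file
                break
            yield chunk


# Build a small sample file so the sketch runs on its own
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'dog_breeds_sample.txt')
    with open(sample_path, 'w') as writer:
        writer.write('Pug\nBoxer\nBorder Terrier\n')

    chunks = list(read_in_chunks(sample_path))

print(chunks)
```

Joining the chunks back together gives the original text, so no characters are lost between calls.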
Here's an example of how to read 5 characters of a line at a time using the Python .readline() method:
with open('dog_breeds.txt', 'r') as reader:
    # Read & print at most 5 characters of the line, 5 times
    print(reader.readline(5))
    # When a line is longer than 5 chars, each call continues
    # down the line, reading 5 chars at a time until the end of the
    # line, and then moves on to the next line
    print(reader.readline(5))
    print(reader.readline(5))
    print(reader.readline(5))
    print(reader.readline(5))
# Output:
# Pug
#
# Jack
# Russe
# ll Te
# rrier
Here's an example of how to read the entire file as a list using the Python .readlines() method:
with open('dog_breeds.txt') as f:
    print(f.readlines())  # .readlines() returns a list object
# Output:
# ['Pug\n', 'Jack Russell Terrier\n', 'English Springer Spaniel\n', 'German Shepherd\n', 'Staffordshire Bull Terrier\n', 'Cavalier King Charles Spaniel\n', 'Golden Retriever\n', 'West Highland White Terrier\n', 'Boxer\n', 'Border Terrier\n']
The above example can also be done by using list() to create a list out of the file object:
with open('dog_breeds.txt') as f:
    print(list(f))
# Output:
# ['Pug\n', 'Jack Russell Terrier\n', 'English Springer Spaniel\n', 'German Shepherd\n', 'Staffordshire Bull Terrier\n', 'Cavalier King Charles Spaniel\n', 'Golden Retriever\n', 'West Highland White Terrier\n', 'Boxer\n', 'Border Terrier\n']
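Both approaches keep the trailing \n on every item. If you want the lines without their line endings, one option is to read the whole file and call str.splitlines() on the result. The sketch below assumes a small temporary sample file standing in for dog_breeds.txt.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Sample file created only for this illustration
    sample_path = os.path.join(tmp_dir, 'dog_breeds_sample.txt')
    with open(sample_path, 'w') as writer:
        writer.write('Pug\nBoxer\nBorder Terrier\n')

    with open(sample_path, 'r') as reader:
        with_newlines = reader.readlines()

    with open(sample_path, 'r') as reader:
        without_newlines = reader.read().splitlines()

print(with_newlines)     # ['Pug\n', 'Boxer\n', 'Border Terrier\n']
print(without_newlines)  # ['Pug', 'Boxer', 'Border Terrier']
```

Note that str.splitlines() also splits on a few less common line boundaries, so calling line.rstrip('\n') on each line is an alternative when only \n should count.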
Iterating Over Each Line in the File
A common thing to do while reading a file is to iterate over each line. Here's an example of how to use the Python .readline() method to perform that iteration:
with open('dog_breeds.txt', 'r') as reader:
    # Read and print the entire file line by line
    line = reader.readline()
    while line != '':  # .readline() returns an empty string at EOF
        print(line, end='')
        line = reader.readline()
# Output:
# Pug
# Jack Russell Terrier
# English Springer Spaniel
# German Shepherd
# Staffordshire Bull Terrier
# Cavalier King Charles Spaniel
# Golden Retriever
# West Highland White Terrier
# Boxer
# Border Terrier
Another way you could iterate over each line in the file is to use the Python .readlines() method of the file object. Remember, .readlines() returns a list where each element in the list represents a line in the file:
with open('dog_breeds.txt', 'r') as reader:
    for line in reader.readlines():
        print(line, end='')
# Output:
# Pug
# Jack Russell Terrier
# English Springer Spaniel
# German Shepherd
# Staffordshire Bull Terrier
# Cavalier King Charles Spaniel
# Golden Retriever
# West Highland White Terrier
# Boxer
# Border Terrier
However, the above examples can be further simplified by iterating over the file object itself:
with open('dog_breeds.txt', 'r') as reader:
    # Read and print the entire file line by line
    for line in reader:
        print(line, end='')
# Output:
# Pug
# Jack Russell Terrier
# English Springer Spaniel
# German Shepherd
# Staffordshire Bull Terrier
# Cavalier King Charles Spaniel
# Golden Retriever
# West Highland White Terrier
# Boxer
# Border Terrier
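This final approach is more Pythonic. It can also be quicker and more memory efficient, because lines are read as the loop asks for them instead of being collected into a list first. For those reasons, this is the approach to prefer.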
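Because iterating over the file object reads lines on demand, a loop can stop early without processing the rest of the file. The sketch below illustrates this; the function name find_first_match and the sample data are assumptions made for illustration.

```python
import os
import tempfile


def find_first_match(path, needle):
    """Return (line_number, line) for the first line containing needle, else None."""
    with open(path, 'r') as reader:
        # enumerate() numbers the lines as the loop requests them
        for line_number, line in enumerate(reader, start=1):
            if needle in line:
                return line_number, line.rstrip('\n')
    return None


with tempfile.TemporaryDirectory() as tmp_dir:
    # Sample file created only for this illustration
    sample_path = os.path.join(tmp_dir, 'dog_breeds_sample.txt')
    with open(sample_path, 'w') as writer:
        writer.write('Pug\nJack Russell Terrier\nBoxer\nBorder Terrier\n')

    result = find_first_match(sample_path, 'Terrier')
    missing = find_first_match(sample_path, 'Poodle')

print(result)   # (2, 'Jack Russell Terrier')
print(missing)  # None
```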
Note
Some of the above examples contain print('some text', end=''). The end='' prevents Python from adding an additional newline to the text that is being printed, so only what is read from the file is printed.
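To see the difference end='' makes, the sketch below captures what print() emits in both cases. The list of lines is a stand-in for what a file would return, and capturing output with contextlib.redirect_stdout is just a convenience for the demonstration.

```python
import io
from contextlib import redirect_stdout

lines = ['Pug\n', 'Boxer\n']  # Lines as a file object would return them

# Capture what print() emits with its default end='\n'
default_output = io.StringIO()
with redirect_stdout(default_output):
    for line in lines:
        print(line)

# Capture what print() emits with end=''
trimmed_output = io.StringIO()
with redirect_stdout(trimmed_output):
    for line in lines:
        print(line, end='')

print(repr(default_output.getvalue()))  # 'Pug\n\nBoxer\n\n'
print(repr(trimmed_output.getvalue()))  # 'Pug\nBoxer\n'
```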
Now let's dive into writing files. As with reading files, file objects have multiple methods that are useful for writing to a file:
Method | What It Does |
---|---|
.write(string) | This writes the string to the file. |
.writelines(seq) | This writes the sequence to the file. No line endings are appended to each sequence item. It's up to you to add the appropriate line ending(s). |
Here's a quick example of using .write() and .writelines():
with open('dog_breeds.txt', 'r') as reader:
    # Note: readlines doesn't trim the line endings
    dog_breeds = reader.readlines()

with open('dog_breeds_reversed.txt', 'w') as writer:
    # Alternatively you could use
    # writer.writelines(reversed(dog_breeds))

    # Write the dog breeds to the file in reversed order
    for breed in reversed(dog_breeds):
        writer.write(breed)
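Because .writelines() doesn't add line endings, items that lack them end up run together on one line. The sketch below shows the difference; the file names and the breed list are made up for the example.

```python
import os
import tempfile

breeds = ['Pug', 'Boxer', 'Border Terrier']  # No trailing newlines

with tempfile.TemporaryDirectory() as tmp_dir:
    joined_path = os.path.join(tmp_dir, 'joined.txt')
    separated_path = os.path.join(tmp_dir, 'separated.txt')

    with open(joined_path, 'w') as writer:
        writer.writelines(breeds)  # Items are written back to back

    with open(separated_path, 'w') as writer:
        # Add the line ending to each item yourself
        writer.writelines(f'{breed}\n' for breed in breeds)

    with open(joined_path, 'r') as reader:
        joined = reader.read()
    with open(separated_path, 'r') as reader:
        separated = reader.read()

print(repr(joined))     # 'PugBoxerBorder Terrier'
print(repr(separated))  # 'Pug\nBoxer\nBorder Terrier\n'
```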
Working With Bytes
Sometimes, you may need to work with files using byte strings. This is done by adding the 'b' character to the mode argument. All of the same methods for the file object apply. However, each of the methods expects and returns a bytes object instead:
with open('dog_breeds.txt', 'rb') as reader:
    print(reader.readline())
# Output:
# b'Pug\n'
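The same rule applies when writing: in binary mode, .write() accepts bytes and rejects str. The sketch below is an illustration under assumptions (a temporary file named sample.bin and UTF-8 as the encoding); it shows the encode and decode steps that move between the two types.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'sample.bin')

    with open(sample_path, 'wb') as writer:
        writer.write('Pug\n'.encode('utf-8'))  # Encode str to bytes before writing
        try:
            writer.write('Boxer\n')  # A str is rejected in binary mode
        except TypeError as error:
            write_error = error

    with open(sample_path, 'rb') as reader:
        raw_line = reader.readline()

text_line = raw_line.decode('utf-8')  # Decode bytes back to str

print(type(raw_line).__name__, raw_line)          # bytes b'Pug\n'
print(type(text_line).__name__, repr(text_line))  # str 'Pug\n'
print(type(write_error).__name__)                 # TypeError
```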
Opening a text file using the b flag isn't that interesting. Let's say we have this cute picture of a Jack Russell Terrier (jack_russell.png):

You can actually open that file in Python and examine the contents! Since the .png file format is well defined, the header of the file is 8 bytes broken up like this:
Value | Interpretation |
---|---|
0x89 | A "magic" number to indicate that this is the start of a PNG |
0x50 0x4E 0x47 | PNG in ASCII |
0x0D 0x0A | A DOS style line ending \r\n |
0x1A | A DOS style EOF character |
0x0A | A Unix style line ending \n |
Sure enough, when you open the file and read these bytes individually, you can see that this is indeed a .png file header:
with open('jack_russell.png', 'rb') as byte_reader:
    print(byte_reader.read(1))
    print(byte_reader.read(3))
    print(byte_reader.read(2))
    print(byte_reader.read(1))
    print(byte_reader.read(1))
# Output:
# b'\x89'
# b'PNG'
# b'\r\n'
# b'\x1a'
# b'\n'
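The same 8 bytes can be compared in a single read to check whether a file looks like a PNG. The sketch below is one way this might be done; is_png is a hypothetical helper, and the stand-in files contain only the leading bytes needed for the check, not real image data.

```python
import os
import tempfile

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'  # The 8 header bytes from the table above


def is_png(path):
    """Return True if the file starts with the PNG signature (hypothetical helper)."""
    with open(path, 'rb') as byte_reader:
        return byte_reader.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


with tempfile.TemporaryDirectory() as tmp_dir:
    # Stand-in files: only the leading bytes matter for this check
    fake_png = os.path.join(tmp_dir, 'fake.png')
    with open(fake_png, 'wb') as writer:
        writer.write(PNG_SIGNATURE + b'not real image data')

    text_file = os.path.join(tmp_dir, 'notes.txt')
    with open(text_file, 'wb') as writer:
        writer.write(b'Pug\n')

    png_result = is_png(fake_png)
    text_result = is_png(text_file)

print(png_result)   # True
print(text_result)  # False
```

A matching signature only suggests the file is a PNG; it says nothing about whether the rest of the file is valid.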
A Full Example: dos2unix.py
Let's bring this whole thing home and look at a full example of how to read and write to a file. The following is a dos2unix-like tool that will convert a file that contains line endings of \r\n to \n.
This tool is broken up into three major sections. The first is str2unix(), which converts a string from \r\n line endings to \n. The second is dos2unix(), which converts a file that contains \r\n line endings into a file that uses \n. dos2unix() calls str2unix() internally. Finally, there's the __main__ block, which runs only when the file is executed as a script. Think of it as the main function found in other programming languages.
"""
A simple script and library to convert files or strings from DOS-like
line endings to Unix-like line endings.
"""

import argparse
import os


def str2unix(input_str: str) -> str:
    r"""
    Converts the string from \r\n line endings to \n

    Parameters
    ----------
    input_str
        The string whose line endings will be converted

    Returns
    -------
        The converted string
    """
    r_str = input_str.replace('\r\n', '\n')
    return r_str


def dos2unix(source_file: str, dest_file: str):
    """
    Converts a file that contains DOS-like line endings into Unix-like

    Parameters
    ----------
    source_file
        The path to the source file to be converted
    dest_file
        The path to the converted file for output
    """
    # NOTE: Could add file existence checking and file overwriting
    # protection
    # newline='' turns off newline translation, so \r\n reaches
    # str2unix() unchanged and \n is written back as-is
    with open(source_file, 'r', newline='') as reader:
        dos_content = reader.read()

    unix_content = str2unix(dos_content)

    with open(dest_file, 'w', newline='') as writer:
        writer.write(unix_content)


if __name__ == "__main__":
    # Create our Argument parser and set its description
    parser = argparse.ArgumentParser(
        description="Script that converts a DOS like file to a Unix like file",
    )

    # Add the arguments:
    #   - source_file: the source file we want to convert
    #   - dest_file: the destination where the output should go

    # Note: the use of the argument type of argparse.FileType could
    # streamline some things
    parser.add_argument(
        'source_file',
        help='The location of the source file'
    )

    parser.add_argument(
        '--dest_file',
        help='Location of dest file (default: source_file appended with `_unix`)',
        default=None
    )

    # Parse the args (argparse automatically grabs the values from
    # sys.argv)
    args = parser.parse_args()

    s_file = args.source_file
    d_file = args.dest_file

    # If the destination file wasn't passed, then assume we want to
    # create a new file based on the old one
    if d_file is None:
        file_path, file_extension = os.path.splitext(s_file)
        d_file = f'{file_path}_unix{file_extension}'

    dos2unix(s_file, d_file)
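As a variation, the same conversion can be done in binary mode, where no newline translation takes place at all. This is a sketch rather than part of the original tool: dos2unix_bytes is a hypothetical name, and the temporary files exist only to demonstrate the result.

```python
import os
import tempfile


def dos2unix_bytes(source_file, dest_file):
    r"""Copy source_file to dest_file, replacing \r\n with \n (hypothetical variant)."""
    with open(source_file, 'rb') as reader:
        dos_content = reader.read()

    with open(dest_file, 'wb') as writer:
        writer.write(dos_content.replace(b'\r\n', b'\n'))


with tempfile.TemporaryDirectory() as tmp_dir:
    source_path = os.path.join(tmp_dir, 'dos.txt')
    dest_path = os.path.join(tmp_dir, 'dos_unix.txt')

    # Write raw bytes so the \r\n endings are stored exactly as given
    with open(source_path, 'wb') as writer:
        writer.write(b'Pug\r\nBoxer\r\n')

    dos2unix_bytes(source_path, dest_path)

    with open(dest_path, 'rb') as reader:
        converted = reader.read()

print(converted)  # b'Pug\nBoxer\n'
```

Working on bytes also means the file's text encoding never has to be guessed, since the content is not decoded.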