Data File Handling
A program has two parts: the algorithm and the data structures.
1. The algorithm takes care of the rules and procedures required for solving the problem and the data
structures contain the data.
2. The data is manipulated by the procedures for achieving the goals of the program. A data structure is
volatile by nature in the sense that its contents are lost as soon as the execution of the program is over.
If we want to store our data permanently, or want to create persistent objects, it becomes
necessary to store the data in a special data structure called a file.
A file can be stored on secondary storage media such as a hard disk. In fact, very large data is almost
always stored in a file.
“A file is a logical collection of records where each record consists of a number of items known as
fields”.
The records in a file can be arranged in the following three ways:
1. Ascending/Descending order: The records in the file can be arranged according to ascending or
descending order of a key field.
2. Alphabetical order: If the key field is of alphabetic type then the records are arranged in alphabetical
order.
3. Chronological order: In this type of order, the records are stored in the order of their occurrence, i.e.
arranged according to dates or events. If the key field is a date, e.g. date of birth or date of joining,
then this type of arrangement is used.
Hence, we can say that files are used to store data permanently on a storage device, and file handling
provides a mechanism to store the output of a program in a file and to perform various operations
on it.
Python also supports file handling and allows users to read and write files, along with many other file
handling operations. File handling exists in many other languages, where the implementation is often
complicated or lengthy; in Python, like other concepts of the language, it is short and easy.
DATA FILE OPERATIONS:
Python file handling takes place in the following order:
1. Opening a file
2. Performing operations
3. Closing the file
Python works with two types of data files:
1. Text File
2. Binary File
A text file consists of a sequence of lines. A line is a sequence of characters, stored on permanent
storage media. By default each line is terminated by a special character, known as EOL (\n). At the
lowest level, a text file is a collection of bytes. Text files are stored in human-readable form and can be
created using any text editor.
A binary file, on the other hand, is used to store binary data such as images, video files, audio files etc. It
contains arbitrary binary data, often numbers, which can be used for numerical operations. So, when we
work with a binary file, we have to interpret the raw bit pattern(s) read from the file as the correct type
of data in our program. In a binary file there is no line delimiter, and no character translation is carried
out. As a result, reading and writing binary files is generally simpler for the program and faster than
reading and writing text files.
In the case of a binary file, it is extremely important that we interpret the data with the correct data type
while reading the file. Python provides special modules for encoding and decoding data for binary files.
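The difference between the two file types can be seen by reading the same file in text mode and in binary mode. The following is a minimal sketch; the file name demo.txt and the temporary folder are assumptions made only for this illustration.

```python
import os
import tempfile

# Hypothetical file, created in a temporary folder just for this sketch.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Write two lines in text mode.
f = open(path, "w")
f.write("first line\nsecond line\n")
f.close()

# Text mode ("r"): the contents come back as a string (str).
f = open(path, "r")
text_data = f.read()
f.close()

# Binary mode ("rb"): the same file comes back as raw bytes.
f = open(path, "rb")
raw_data = f.read()
f.close()

print(type(text_data).__name__)   # str
print(type(raw_data).__name__)    # bytes
print(text_data.splitlines())     # ['first line', 'second line']
```

In text mode Python decodes the bytes and handles the end-of-line character for us; in binary mode the program receives the bytes exactly as stored and must interpret them itself.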
STEPS IN FILE HANDLING:
1. OPEN A FILE: When something is to be read from or written to a file, the first step is to open the file.
Opening a file means creating a connection between the program and the file. When the file is opened, the
operating system is asked to find the file named in the open statement and, for reading, to make sure
the file exists. The file object is created by using the open() function (the file() function existed only in
Python 2). The syntax is
file_object = open(file_name, access_mode)
The access mode tells the interpreter how the file will be used in the program (reading, writing or
appending).
The second parameter is optional, as the default mode is the read mode.
read (r) : to read the file
write (w) : to write to the file
append (a) : to write at the end of the file
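The following table lists the access modes commonly used with open():
Mode   Meaning
r      read only; the file must exist (default mode)
w      write only; creates the file, or erases its contents if it exists
a      append; creates the file if needed and writes at the end
r+     read and write; the file must exist
w+     write and read; creates the file, or erases its contents if it exists
a+     append and read; creates the file if needed
Adding b to a mode (rb, wb, ab, ...) opens the file in binary mode.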
Note :
1. When a file is opened for write or append with only a file name, it is created in the current working
directory (the folder from which the program is run), which is not necessarily the folder where Python is
installed. The complete path is given if the file is to be created somewhere else.
2. While reading, the given file must exist in that folder, otherwise Python will raise FileNotFoundError.
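The behaviour of the three basic modes can be tried out with a short sketch. The file names notes.txt and no_such_file.txt, and the temporary folder, are hypothetical and used only for this illustration.

```python
import os
import tempfile

folder = tempfile.mkdtemp()                 # temporary folder for this sketch
path = os.path.join(folder, "notes.txt")    # hypothetical file name

# "w" creates the file because it does not exist yet.
f = open(path, "w")
f.write("one\n")
f.close()

# "a" keeps the old contents and writes at the end.
f = open(path, "a")
f.write("two\n")
f.close()

# No mode given: the default is read mode.
f = open(path)
after_append = f.read()
f.close()

# "w" on an existing file erases the old contents.
f = open(path, "w")
f.write("three\n")
f.close()

f = open(path, "r")
after_write = f.read()
f.close()

# Reading a file that does not exist raises FileNotFoundError.
try:
    open(os.path.join(folder, "no_such_file.txt"), "r")
    error_name = None
except FileNotFoundError as err:
    error_name = type(err).__name__

print(repr(after_append))   # 'one\ntwo\n'
print(repr(after_write))    # 'three\n'
print(error_name)           # FileNotFoundError
```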
Once the file has been opened successfully and the file object has been created, various details can be
retrieved by using its properties:
a. name - name of the file opened.
b. mode - mode in which the file was opened.
c. closed - Boolean value which indicates whether the file is closed or not.
d. readable() - method that returns a Boolean value indicating whether the file can be read.
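The details of a file object can be inspected as in the sketch below. The file name t1.txt inside a temporary folder is an assumption for this illustration; note that readable is called as a method, while name, mode and closed are plain attributes.

```python
import os
import tempfile

# Hypothetical file in a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "t1.txt")

f = open(path, "w")
mode_seen = f.mode          # mode in which the file was opened
closed_before = f.closed    # False while the file is open
can_read = f.readable()     # False, because the file is open for writing only

print(f.name)               # full path of the file that was opened
print(mode_seen)            # w
print(closed_before)        # False
print(can_read)             # False
f.close()
```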
3. CLOSE A FILE : The close() method of a file object flushes any unwritten information and closes the file
object. Python usually closes a file automatically when the variable that refers to it is reassigned to
another file, but it is good practice to call close() explicitly. The syntax is
Fileobject.close()
It basically breaks the link between the file object and the file on the disk.
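What "breaking the link" means in practice can be sketched as follows: after close() the data is safely on disk, and any further operation on the same file object fails. The file name closing.txt is hypothetical.

```python
import os
import tempfile

# Hypothetical file in a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "closing.txt")

f = open(path, "w")
f.write("saved on close\n")
f.close()                      # flushes the data and breaks the link

# Writing through a closed file object raises ValueError.
try:
    f.write("too late")
    result = "written"
except ValueError:
    result = "ValueError"

# The text written before close() is on disk.
f = open(path, "r")
contents = f.read()
f.close()

print(result)                  # ValueError
print(contents, end="")        # saved on close
```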
TEXT FILE OPERATIONS:
1. Writing in a file: The data entered by the user can be written to a text file (or binary file)
using the writing methods. Writing in a text file (character data) is done by two methods:
A. write(string) : This method takes a string as a parameter and writes it to the file. Since each line in a
text file is terminated by a special character, known as EOL (\n), we have to add the \n character
to the end of the string ourselves. To store a numeric value, it needs to be converted to a string first.
B. writelines(list) : To write a single string to a file the write() method is used, but to write multiple
strings (a list or tuple of strings) to a file the writelines() method is used. This method also does not
add any EOL character, so the programmer has to put \n at the end of each item.
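The point about numeric values can be illustrated with a small sketch: each number is converted with str() before writing and converted back with int() after reading. The list marks and the file name marks.txt are made up for this example.

```python
import os
import tempfile

# Hypothetical file in a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "marks.txt")

marks = [78, 92, 65]           # made-up numeric data

f = open(path, "w")
for m in marks:
    f.write(str(m) + "\n")     # write() needs a string, and \n is added by hand
f.close()

f = open(path, "r")
lines = f.readlines()          # every number comes back as a string
f.close()

total = 0
for line in lines:
    total += int(line)         # int() ignores the trailing \n

print(lines)                   # ['78\n', '92\n', '65\n']
print(total)                   # 235
```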
It can be summarized as :
• First, to create a new file you need to use the built-in function open(). The open() function accepts the
file name and the mode w as parameters. You can pass a relative or an absolute path to the open() function.
If you pass a relative path, it is resolved against the current working directory. Mode w indicates that
the file is created if it does not exist; otherwise the existing file content is erased. The open() function
returns a file object that allows further manipulation.
• Second, to write text into a text file you use the write() method of the file object. You can write a single
line or multiple lines of text as one string. To start a new line in the file you need to explicitly add the
end-of-line character \n to the string.
• Third, you must close the file to finalize the file contents and free up system resources by using
the close() method of the file object.
Ex 1
s1="This is python\nwe are indian "
file=open("t1.txt","w")
file.write(s1)
file.close()
Ex 2
s1="This is python.we are indian "
file=open("C:\\Users\\..\\Desktop\\t2.txt","w")
file.write(s1)
file.close()
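Ex 3
l1=["python","C++","Java"]
file=open("C:\\Users\\..\\Desktop\\t2.txt","w")
try:
    file.write(l1)        # write() accepts only a string, so a list raises TypeError
except TypeError as err:
    print(err)
file.close()
A list can be written item by item in a loop, because each item is a string:
l1=["python\n","C++\n","Java\n"]
file=open("C:\\Users\\..\\Desktop\\t2.txt","w")
for i in l1:
    file.write(i)         # one write() call per string
file.close()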
but with the writelines() method the entire list can be written in one go.
l1=["python","C++","Java"]
file=open("C:\\Users\\..\\Desktop\\t2.txt","w")
file.writelines(l1)       # no \n in the items, so the file contains pythonC++Java on one line
file.close()
or, to put each item on its own line,
l1=["python\n","C++\n","Java\n"]
file=open("C:\\Users\\..\\Desktop\\t2.txt","w")
file.writelines(l1)
file.close()
There are three ways to read data from a text file.
1. read(): Returns the data read in the form of a string. Reads n characters; if n is not specified, reads
the entire file.
File_object.read([n])
2. readline(): Reads one line of the file and returns it in the form of a string. The terminating \n
is also read from the file and kept at the end of the string. If n is specified, reads at
most n characters. However, it does not read more than one line, even if n exceeds the length
of the line. If all lines are to be read one by one, a loop can be used.
File_object.readline([n])
3. readlines(): Reads all the lines and returns them as a list in which each line is a string element.
File_object.readlines()
file=open("C:\\Users\\Gokul Pareek\\Desktop\\t2.txt","r")
data=file.read()
print(data)
file.close()
file=open("C:\\Users\\Gokul Pareek\\Desktop\\t2.txt","r")
data=file.read(6)
print(data)
file.close()
file=open("C:\\Users\\Gokul Pareek\\Desktop\\t2.txt","r")
data=file.readline()
print(data)
file.close()
file=open("C:\\Users\\Gokul Pareek\\Desktop\\t2.txt","r")
data=file.readline()
print(data)
file.close()
file=open("C:\\Users\\Gokul Pareek\\Desktop\\t2.txt","r")
data=file.readline()
while data:                   # readline() returns an empty string at end of file
    print(data,end='')
    data=file.readline()
file.close()
file=open("C:\\Users\\Gokul Pareek\\Desktop\\t2.txt","r")
data=file.readlines()
print(data)
file.close()
file=open("C:\\Users\\Gokul Pareek\\Desktop\\t2.txt","r")
data=file.readlines()
for i in data:
    print(i,end='')
file.close()
One needs to be careful when reading the entire contents of a file at once, because the whole file has to
fit in memory.
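One way to keep memory use small is to read with readline() in a loop, so that only one line is held at a time. The sketch below counts the lines and finds the longest one in this manner; the file name langs.txt and its contents are assumptions for the illustration.

```python
import os
import tempfile

# Hypothetical file in a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "langs.txt")

f = open(path, "w")
f.writelines(["python\n", "C++\n", "Java\n"])
f.close()

f = open(path, "r")
line_count = 0
longest = ""
line = f.readline()
while line:                        # an empty string marks the end of the file
    line_count += 1
    if len(line) > len(longest):
        longest = line
    line = f.readline()            # only the current line is kept in memory
f.close()

print(line_count)                  # 3
print(longest, end="")             # python
```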
Q Write a program to display the longest line of the file t3.txt (assume file is created)
file=open("C:\\Users\\Gokul Pareek\\Desktop\\t2.txt","r")
data=file.readlines()
maxi=""
for i in data:
    if (len(i)>len(maxi)):
        maxi=i
print(maxi,end='')
file.close()
Q Write a program to write multiple lines of text to file t3.txt, keep asking user if they want to add more to file.
file=open("C:\\Users\\Gokul Pareek\\Desktop\\t2.txt","a")
ch='y'
while (ch=='y'):
    data=input("enter a line")
    file.write(data+"\n")      # input() drops the newline, so add EOL explicitly
    ch=input("do you want to add more to file ....")
file.close()
Q Write a function to take input of lines in a1.txt, close the file and display those lines which are starting with ‘A’
f1=open("C:\\Users\\Gokul Pareek\\Desktop\\t3.txt","r")
L1=f1.readlines()
for i in L1:
    if i[0]=='A':             # first character of the line
        print(i,end='')
f1.close()
Q Write a program to read the text file data.txt and count the number of times “my” occurred in the file.
file=open("C:\\Users\\Gokul Pareek\\Desktop\\t3.txt","w")
s1="thfm dsfsdf sdfdsf sdfdsf dsgdsgdsf my fdsf dfds dsfds my asdsad asdsads asfsadff"
file.write(s1)
file.close()
file=open("C:\\Users\\Gokul Pareek\\Desktop\\t3.txt","r")
word=file.read()
x=word.split()
count=0
for i in x:
    if ( i=="my"):
        count+=1
print(count)
file.close()
Q Write a program to read from a text file p.txt and display those words which are less than 5 characters.
file=open("C:\\Users\\Gokul Pareek\\Desktop\\t2.txt","r")
word=file.read()
x=word.split()
for i in x:
    if ( len(i)<5):
        print(i)              # display each word shorter than 5 characters
file.close()
Q A text file “Quotes.Txt” has the following data written in it:
Living a life you can be proud of
Doing your best
Spending your time with people and activities that are important to you
Standing up for things that are right even when it’s hard
Becoming the best version of you
Write a user defined function to display the total number of words present in the file.
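One possible solution is sketched below. The function name count_words is an assumption, and the sketch first creates Quotes.Txt in a temporary folder with the five lines given in the question so that it can run on its own.

```python
import os
import tempfile


def count_words(file_name):
    """Return the total number of words in the given text file."""
    f = open(file_name, "r")
    words = f.read().split()       # split() breaks the text at spaces and newlines
    f.close()
    return len(words)


# Create the file from the question in a temporary folder (for this sketch only).
path = os.path.join(tempfile.mkdtemp(), "Quotes.Txt")
f = open(path, "w")
f.write("Living a life you can be proud of\n")
f.write("Doing your best\n")
f.write("Spending your time with people and activities that are important to you\n")
f.write("Standing up for things that are right even when it's hard\n")
f.write("Becoming the best version of you\n")
f.close()

total = count_words(path)
print(total)                       # 40
```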
1. File handling is used when data is to be written (output) to a file or read (input) from a file.
2. File handling is done using a file object.
3. A file can be opened in read, write or append mode.
4. To open a file, the parameters are the file name and the mode in which it is to be opened.
5. If a file is opened in write mode and the file doesn't exist, it will be created; if it exists, it will be overwritten.
6. If a file is opened in read mode and the file doesn't exist, an error is raised; otherwise the file pointer
will be at the beginning of the file.
7. To retain the previous contents of the file (and write more), it must be opened in append mode.
8. The default mode is read mode.
9. The file should be closed after the operations.
10. A text file is in human-readable format and usually has the extension .txt.
1. Writing in a file :
a. To write a string, use write() with the string.
b. To write several lines to a file, use writelines() with a list of strings.
In both cases the data is written one after another on the same line, with no separator added. To have the
data on different lines, EOL (\n) should be used.
2. Reading from a file :
a. To read the entire data into a string use read(); read(n) can be used to read n characters.
b. To read a single line, readline() is used; readline(n) reads only part of the line when n is smaller
than the length of the line.
c. To read all the lines, use readlines(). It creates a list of strings.
Note
A. When words are to be compared, use read() and split(), and compare inside a loop (read() returns the
data as a string and split() breaks it into the words to be compared).
B. When a specific line is to be read, use readline().
C. When the search is line based (for example, lines starting with A), use readlines(); it fetches the data
as a list of strings, in which each line can be compared.
1. Write a Python program to read an entire text file.
3. Write a Python program to append text to a file and display the text.
Binary file can have custom file formats and the developer, who designs these custom file
formats, converts the information, to be stored, in bits and arranges these bits in binary file so
that they are well understood by the supporting application and when needed, can easily be
read by the supporting application.
One of the most common examples of a binary file is an image file such as .PNG or .JPG. If one tries to
open these files using a text editor, he/she may get unrecognizable characters, but when opened
using a supporting image viewer, the file is shown as a single image. This is because
the file is in binary format and contains data in the form of a sequence of bytes. When the text
editor reads these bytes and tries to convert them into characters, it gets undesired
special characters and displays them to the user.
If we wish to write a structure such as a list or dictionary to a file and read it back subsequently, the
methods of text files don't work directly, as a text file can hold only text data.
Pickling and unpickling:
The Python pickle module is used for serializing and de-serializing a Python object structure. Most
objects in Python can be pickled so that they can be saved on disk. What pickle does is that it
"serializes" the object first before writing it to a file. Pickling is a way to convert a Python object
(list, dict, etc.) into a byte stream. The idea is that this byte stream contains all the
information necessary to reconstruct the object in another Python script.
import pickle
Emp= {1:"Zack",2:"53050",3:"IT",4:"38",5:"Flipkart"}
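The fragment above stops after creating the dictionary. A minimal sketch of how it might continue is given below: the dictionary is pickled with pickle.dump() and read back with pickle.load(). The file name emp.dat and the temporary folder are assumptions for this illustration.

```python
import os
import pickle
import tempfile

Emp = {1: "Zack", 2: "53050", 3: "IT", 4: "38", 5: "Flipkart"}

# Hypothetical binary file in a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "emp.dat")

# Pickling: the file must be opened in binary write mode ("wb").
f = open(path, "wb")
pickle.dump(Emp, f)
f.close()

# Unpickling: the file must be opened in binary read mode ("rb").
f = open(path, "rb")
restored = pickle.load(f)
f.close()

print(restored == Emp)     # True
print(restored[1])         # Zack
```

The object read back is a dictionary again, with no parsing needed, which is the main advantage over writing the same data as text.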
Article:
You just ran through a time-consuming process to load a bunch of data into a python object.
Maybe you scraped data from thousands of websites. Maybe you computed a zillion digits of
pi. If your laptop battery dies or if python crashes, your information will be lost.
Pickling allows you to save a python object as a binary file on your hard drive. After you pickle
your object, you can kill your python session, reboot your computer if you want, and later load
your object into python again.
You could back up your pickle file to Google Drive or DropBox or a plain old USB stick if you
wanted. You could email it to a friend.
Article :
Suppose you just spent a better part of your afternoon working in Python, processing many data
sources to build an elaborate, highly structured data object. Say it is a dictionary of English words
with their frequency counts, translation into other languages, etc. And now it's time to close your
Python program and go eat dinner. Obviously, you want to save this object for future use, but
how?
You *could* write the data object out to a text file, but that's not optimal. Once written as a text
file, it is a simple text file, meaning that the next time you read it in you will have to parse the text and
process it back into your original data structure.
What you want, then, is a way to save your Python data object as itself, so that next time you
need it you can simply load it up and get your original object back. Pickling and unpickling let you
do that. A Python data object can be "pickled" as itself, which then can be directly loaded
("unpickled") as such at a later point; the process is also known as "object serialization".
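import pickle
grades = {'Lisa': 98, 'Bart': 75, 'Milhouse': 80, 'Nelson': 65}   # the dictionary to be saved
f = open('gradesdict.pkl', 'wb')   # 'wb' for writing binary file
pickle.dump(grades, f)
f.close()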
import pickle
f = open('gradesdict.pkl', 'rb')   # 'rb' for reading binary file
mydict = pickle.load(f)
f.close()
print(mydict)
# prints {'Lisa': 98, 'Bart': 75, 'Milhouse': 80, 'Nelson': 65}
1,Abhimanyu-Science