
Unit 3 - Files


UNIT-III:

III.Files and Directories


3.1 File Concept
3.2 File types
3.3 File System Structure

3.4 File metadata


3.4.1 Inodes
3.4.2 kernel support for files
3.5 System calls for file I/O operations
3.5.1 open, creat, read, write, close, lseek, dup2
3.6 file status information- stat family

3.7 file and record locking- fcntl function


3.8 file permission- chmod, fchmod
3.9 file ownership- chown, lchown,fchown
3.10 links- soft links & hard links- symlink, link, unlink
3.11 Directories
3.11.1 creating- mkdir, removing - rmdir

3.11.2 changing directories -chdir


3.12 obtaining current working directory- getcwd
3.13 scanning directories - opendir, readdir, closedir, rewinddir functions
Unit III

Files and Directories


3.1 Working with Files

In this chapter we learn how to create, open, read, write, and close files.

UNIX File Structure

In UNIX, everything is a file.

Programs can use disk files, serial ports, printers and other devices in exactly the same way
as they would use a file.

Directories, too, are special sorts of files.

3.2 File types

Most files on a UNIX system are regular files or directories, but there are additional types of
files:

1. Regular files: The most common type of file, which contains data of some form. There
is no distinction to the UNIX kernel whether this data is text or binary.
2. Directory file: A file that contains the names of other files and pointers to information on
these files. Any process that has read permission for a directory file can read the contents
of the directory, but only the kernel can write to a directory file.
3. Character special file: A type of file used for certain types of devices on a system.
4. Block special file: A type of file typically used for disk devices. All devices on a
system are either character special files or block special files.
5. FIFO: A type of file used for communication between processes. It is
sometimes called a named pipe.
6. Socket: A type of file used for network communication between processes. A socket
can also be used for nonnetwork communication between processes on a single host.
7. Symbolic link: A type of file that points to another file.

The type of a file is encoded in the st_mode member of the stat structure. Each of the
following macros takes st_mode as its argument and tests for one file type:

Macro         Type of file
S_ISREG()     Regular file
S_ISDIR()     Directory file
S_ISCHR()     Character special file
S_ISBLK()     Block special file
S_ISFIFO()    Pipe or FIFO
S_ISLNK()     Symbolic link
S_ISSOCK()    Socket

3.3 File System Structure

Files are arranged in directories, which also contain subdirectories.

A user, neil, usually has his files stored in a 'home' directory, perhaps /home/neil.

Files and Devices


Even hardware devices are represented (mapped) by files in UNIX. For example, as root, you
can mount a CD-ROM drive as a file:

$ mount -t iso9660 /dev/hdc /mnt/cd_rom
$ cd /mnt/cd_rom

Some important device files are:

/dev/console - This device represents the system console.
/dev/tty - This special file is an alias (logical device) for the controlling terminal (keyboard and
screen, or window) of a process.
/dev/null - This is the null device. All output written to this device is discarded.

3.4 File Metadata

3.4.1 Inodes

• The inode is a structure that is maintained in a separate area of the hard disk.
• File attributes are stored in the inode.
• Every file is associated with a table called the inode.
• The inode is accessed by the inode number.
• The inode contains the following attributes of a file:
  file type, file permissions, number of links,
  UID of the owner, GID of the group owner, file size,
  date and time of last modification, last access, and last change.

File attributes

Attribute value      Meaning
File type            Type of the file
Access permission    File access permission for owner, group and others
Hard link count      Number of hard links to the file
UID                  User ID of the file owner
GID                  Group ID of the file
File size            File size in bytes
Inode number         System inode number of the file
File system ID       ID of the file system where the file is stored

3.4.2 Kernel Support For Files:


UNIX supports the sharing of open files between different processes. The kernel uses three data
structures, and the relationships among them determine the effect one process has on
another with regard to file sharing.

1. Every process has an entry in the process table. Within each process table entry is a table
of open file descriptors, which is taken as a vector, with one entry per descriptor.
Associated with each file descriptor are
a. The file descriptor flags.
b. A pointer to a file table entry.
2. The kernel maintains a file table for all open files. Each file table entry contains
a. The file status flags for the file(read, write, append, sync, nonblocking, etc.),
b. The current file offset,
c. A pointer to the v-node table entry for the file.
3. Each open file (or device) has a v-node structure. The v-node contains information about
the type of file and pointers to functions that operate on the file. For most files the v-
node also contains the i-node for the file. This information is read from disk when the
file is opened, so that all the pertinent information about the file is readily available.
Consider the arrangement of these three tables for a single process that has two different files
open: one file is open on standard input (file descriptor 0) and the other is open on standard
output (file descriptor 1). Each descriptor points to its own file table entry, and each file
table entry points to its own v-node.

Now suppose two independent processes have the same file open, the first process on file
descriptor 3 and the second process on file descriptor 4. Each process that opens the file gets
its own file table entry, but only a single v-node table entry is needed for the file. One reason
each process gets its own file table entry is so that each process has its own current offset
for the file.

• After each write is complete, the current file offset in the file table entry is incremented
by the number of bytes written. If this causes the current file offset to exceed the current
file size, the current file size in the i-node table entry is set to the current file offset
(that is, the file is extended).
• If a file is opened with the O_APPEND flag, a corresponding flag is set in the file status flags
of the file table entry. Each time a write is performed for a file with this append flag
set, the current file offset in the file table entry is first set to the current file size from the
i-node table entry. This forces every write to be appended to the current end of file.
• The lseek function only modifies the current offset in the file table entry. No I/O takes
place.
• If a file is positioned to its current end of file using lseek, all that happens is that the current
file offset in the file table entry is set to the current file size from the i-node table entry.

It is possible for more than one file descriptor entry to point to the same file table entry. The
file descriptor flags apply only to a single descriptor in a single process, while the file status
flags apply to all descriptors in any process that point to the given file table entry.

3.5 System Calls and Device Drivers

System calls are provided by UNIX to access and control files and devices.

A number of device drivers are part of the kernel.

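The system calls to access the device drivers include:

open     Open a file or device
read     Read from an open file or device
write    Write to a file or device
close    Close the file or device
ioctl    Pass control information to a device driver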

Library Functions

To provide a higher-level interface to devices and disk files, UNIX provides a number of
standard libraries.

Low-level File Access

Each running program, called a process, has associated with it a number of file descriptors.

When a program starts, it usually has three of these descriptors already opened. These are:

0    Standard input
1    Standard output
2    Standard error

write

The write system call arranges for the first nbytes bytes from buf to be written to the file
associated with the file descriptor fildes.

With this knowledge, let's write our first program, simple_write.c:


Here is how to run the program and its output.

$ simple_write
Here is some data
$

read

The read system call reads up to nbytes of data from the file associated with the file
descriptor fildes and places them in the data area buf.

This program, simple_read.c, copies the first 128 bytes of the standard input to the standard
output.

If you run the program, you should see:


$ echo hello there | simple_read
hello there
$ simple_read < draft1.txt
Files

open

To create a new file descriptor we need to use the open system call.

open establishes an access path to a file or device.

The name of the file or device to be opened is passed as a parameter, path, and
the oflags parameter is used to specify actions to be taken on opening the file.

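The oflags are specified as a bitwise OR of a mandatory file access mode and other optional
modes. The open call must specify one of the following file access modes:

Mode        Description
O_RDONLY    Open for read-only
O_WRONLY    Open for write-only
O_RDWR      Open for reading and writing

The call may also include a combination (bitwise OR) of the following optional modes in
the oflags parameter:

Mode        Description
O_APPEND    Place written data at the end of the file
O_TRUNC     Set the length of the file to zero, discarding existing contents
O_CREAT     Create the file, if necessary, with permissions given in mode
O_EXCL      Used with O_CREAT; makes open fail if the file already exists

Initial Permissions

When we create a file using the O_CREAT flag with open, we must use the three-parameter
form. mode, the third parameter, is made from a bitwise OR of the flags defined in the header
file sys/stat.h. These are:

Flag       Permission
S_IRUSR    Read permission, owner
S_IWUSR    Write permission, owner
S_IXUSR    Execute permission, owner
S_IRGRP    Read permission, group
S_IWGRP    Write permission, group
S_IXGRP    Execute permission, group
S_IROTH    Read permission, others
S_IWOTH    Write permission, others
S_IXOTH    Execute permission, others

For example, calling open with the path myfile, the flag O_CREAT, and the mode
S_IRUSR|S_IXOTH has the effect of creating a file called myfile, with read permission for the
owner and execute permission for others, and only those permissions.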

umask

The umask is a system variable that encodes a mask for file permissions to be used when a file is
created.

You can change the variable by executing the umask command to supply a new value.

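The value is a three-digit octal value. Each digit is the result of ORing values from 1, 2, or 4,
where 4 blocks read, 2 blocks write, and 1 blocks execute permission. The three digits apply to
the owner, the group, and others, in that order.
For example, to block 'group' write and execute, and 'other' write, the umask would be:

Digit    Value
1        0
2        2 (write) | 1 (execute)
3        2 (write)

Values for each digit are ORed together; so digit 2 will have 2 | 1, giving 3. The
resulting umask is 032.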

close
We use close to terminate the association between a file descriptor, fildes, and its file.

ioctl

ioctl is a bit of a rag-bag of things. It provides an interface for controlling the behavior of
devices, their descriptors and configuring underlying services.

ioctl performs the function indicated by cmd on the object referenced by the descriptor fildes.

Try It Out - A File Copy Program

We now know enough about the open, read and write system calls to write a low-level
program, copy_system.c, to copy one file to another, character by character.

Running the program will give the following:


We used the UNIX time facility to measure how long the program takes to run. It took two and
a half minutes to copy the 1 MB file.

We can improve by copying in larger blocks. Here is the improved copy_block.c program.

Now try the program, first removing the old output file:
The revised program took under two seconds to do the copy.

Other System Calls for Managing Files

Here are some system calls that operate on these low-level file descriptors.

lseek

The lseek system call sets the read/write pointer of a file descriptor, fildes. You use it to set
where in the file the next read or write will occur.

The offset parameter is used to specify the position and the whence parameter specifies how the
offset is used.

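whence can be one of the following:

SEEK_SET    offset is an absolute position
SEEK_CUR    offset is relative to the current position
SEEK_END    offset is relative to the end of the file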

dup and dup2

The dup system calls provide a way of duplicating a file descriptor, giving two or more, different
descriptors that access the same file.

3.6 File Status Information-Stat Family: fstat, stat and lstat


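The fstat system call returns status information about the file associated with an open file
descriptor. The related calls stat and lstat return the same information for a named file.
When the file is a symbolic link, lstat reports on the link itself, while stat reports on the
file that the link refers to.

The members of the structure, stat, may vary between UNIX systems, but will include:

Member      Description
st_mode     File permissions and file-type information
st_ino      The inode associated with the file
st_dev      The device the file resides on
st_uid      The user identity of the file owner
st_gid      The group identity of the file owner
st_atime    The time of last access
st_ctime    The time of last change to permissions, owner, group or content
st_mtime    The time of last modification to contents
st_nlink    The number of hard links to the file

The permissions flags are the same as for the open system call above. File-type flags include:

S_IFBLK     Entry is a block special device
S_IFDIR     Entry is a directory
S_IFCHR     Entry is a character special device
S_IFIFO     Entry is a FIFO (named pipe)
S_IFREG     Entry is a regular file
S_IFLNK     Entry is a symbolic link

Other mode flags include:

S_ISUID     Entry has set-user-ID on execution
S_ISGID     Entry has set-group-ID on execution

Masks to interpret the st_mode flags include:

S_IFMT      File type
S_IRWXU     User read/write/execute permissions
S_IRWXG     Group read/write/execute permissions
S_IRWXO     Others' read/write/execute permissions

There are some macros defined to help with determining file types. These are the S_ISREG,
S_ISDIR, S_ISCHR, S_ISBLK, S_ISFIFO, S_ISLNK and S_ISSOCK macros listed in section 3.2.

To test that a file doesn't represent a directory and has execute permission set for the owner and
no other owner permissions, we can check that S_ISDIR(st_mode) is false and that
(st_mode & S_IRWXU) equals S_IXUSR.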
3.7 File and record locking-fcntl function

• File locking is applicable only for regular files.
• It allows a process to impose a lock on a file so that other processes cannot modify the
file until it is unlocked by the process.
• Write lock: it prevents other processes from setting any overlapping read or write locks on
the locked region of a file.
• Read lock: it prevents other processes from setting any overlapping write locks on the
locked region of a file.
• A write lock is also called an exclusive lock and a read lock is also called a shared lock.
• The fcntl API can be used to impose read or write locks on either a segment or an entire file.
• Function prototype:

#include <fcntl.h>

int fcntl(int fdesc, int cmd_flag, ...);

• All file locks set by a process will be unlocked when the process terminates.

3.8 File Permission-chmod

You can change the permissions on a file or directory using the chmod system call. This forms the
basis of the chmod shell program.

3.9 chown
A superuser can change the owner of a file using the chown system call.

3.10 Links-soft link and hard link

Soft links (symbolic links): Refer to a symbolic path indicating the abstract location of another
file.
• Used to provide an alternative means of referencing files.
• Users may create symbolic links for files using the ln command with the -s option.
Hard links: Refer to the specific location of physical data.
• A hard link is a UNIX path name for a file.
• Most files have only one hard link. However, users may create additional hard links for
files using the ln command.
Limitations:
• Users cannot create hard links for directories unless they have superuser privileges.
• Users cannot create hard links that reference files on a different file system.

3.10.1 unlink, link, symlink

We can remove a file using unlink.

The unlink system call decrements the link count on a file.

The link system call creates a new hard link to an existing file.

The symlink system call creates a symbolic link to an existing file.

3.11 Directories

As well as its contents, a file has a name and 'administrative information', i.e. the file's
creation/modification date and its permissions.
The permissions are stored in the inode, which also contains the length of the file and where on
the disc it's stored.

A directory is a file that holds the inode numbers and names of other files.

3.11.1 mkdir, rmdir

We can create and remove directories using the mkdir and rmdir system calls.

The mkdir system call makes a new directory with path as its name.

The rmdir system call removes an empty directory.

3.11.2 chdir

A program can navigate directories using the chdir system call.

3.12 Current Working Directory- getcwd

A program can determine its current working directory by calling the getcwd library function.

The getcwd function writes the name of the current directory into the given buffer, buf.
3.13 Scanning Directories

The directory functions are declared in a header file, dirent.h. They use a structure, DIR, as a
basis for directory manipulation.

Here are these functions:

3.13.1 opendir

The opendir function opens a directory and establishes a directory stream.

3.13.2 readdir

The readdir function returns a pointer to a structure detailing the next directory entry in the
directory stream dirp.

The dirent structure containing directory entry details includes the following entries:

ino_t d_ino      The inode of the file
char d_name[]    The name of the file

telldir

The telldir function returns a value that records the current position in a directory stream.

seekdir

The seekdir function sets the directory entry pointer in the directory stream given by dirp.

3.13.3 closedir

The closedir function closes a directory stream and frees up the resources associated with it.

Try It Out - A Directory Scanning Program

1. The printdir function prints out the current directory. It will recurse for subdirectories.
2. Now we move on to the main function.

The program produces output like this (edited for brevity):

How It Works

After some initial error checking, using opendir, to see that the directory exists, printdir makes
a call to chdir to the directory specified. While the entries returned by readdir aren't null, the
program checks to see whether the entry is a directory. If it isn't, it prints the file entry with
indentation depth.
