Working with Files

Last updated on June 30th, 2024 at 03:20 pm

Standard File Operations

Ruby can natively read and write many different file types. Common to all file types is the need to open, read, write and close.

Opening

A file is opened in Ruby using the open method, which in its simplest form just takes a file name and attempts to open the file for reading.
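A minimal sketch of the simplest form, assuming the open method referred to here is File.open. The file name sample.txt and its contents are hypothetical and are created in a temporary directory only so the example is self-contained.

```ruby
require 'tmpdir'

# Hypothetical sample file, created only so the example can run on its own.
path = File.join(Dir.mktmpdir, 'sample.txt')
File.write(path, "first line\nsecond line\n")

# With only a file name, File.open opens the file for reading.
file = File.open(path)
contents = file.read
file.close

puts contents
```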

The open method can take parameters and options, however. For example, if you want to open a file for read only, then you would add the mode parameter ‘r’.

You can also open the file for writing by using ‘w’ on the mode parameter.
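The two modes mentioned so far might be used as in the following sketch. The file name modes.txt is a placeholder, and the last part only illustrates that a stream opened with the read-only mode rejects writes.

```ruby
require 'tmpdir'

path = File.join(Dir.mktmpdir, 'modes.txt') # hypothetical file name

# 'w' creates the file (or truncates an existing one) and allows writing only.
writer = File.open(path, 'w')
writer.write("hello\n")
writer.close

# 'r' opens the existing file read-only, starting at the beginning.
reader = File.open(path, 'r')
text = reader.read

# Writing to a stream opened with 'r' raises IOError.
error_message = nil
begin
  reader.write('not allowed')
rescue IOError => e
  error_message = e.message
ensure
  reader.close
end
```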

The mode parameters available to you are:

| Mode | Purpose |
| --- | --- |
| ‘r’ | Read only, starts reading at the beginning of the file |
| ‘r+’ | Read-write, starts at the beginning of the file |
| ‘w’ | Write only. Truncates the file if it already exists. Creates a new file if one does not exist. |
| ‘w+’ | Read-write. Truncates the file if it already exists. Creates a new file if one does not exist. |
| ‘a’ | Write-only. Each write call appends data to end of file. Creates a new file if one does not exist. |
| ‘a+’ | Read-write. Each write call appends data to end of file. Creates a new file if one does not exist. |
| Additional Modes (must be used in accompaniment with previous modes) | |
| ‘b’ | Binary file mode. Will not use EOL conversion. Sets external encoding to ASCII-8BIT by default |
| ‘t’ | Text file mode. |

File Modes when opening

For example, let’s assume we want to open a file write-only and binary.
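Combining the write-only mode with the binary modifier gives the mode string 'wb'. The sketch below assumes a hypothetical file named data.bin and writes three raw bytes to it.

```ruby
require 'tmpdir'

path = File.join(Dir.mktmpdir, 'data.bin') # hypothetical file name

# 'w' (write only) combined with 'b' (binary) gives the mode string 'wb'.
file = File.open(path, 'wb')
file.write([0, 255, 16].pack('C*')) # three raw bytes
file.close

# Read the bytes back in binary mode to confirm nothing was converted.
bytes = File.binread(path).bytes
```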

Reading

If you want to open a file and read in one statement, then the read method is just what you’re looking for. It will open, read and close the file in one statement.
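As an illustration, assuming the read method meant here is the class method File.read, the whole file can be loaded with one call. The file name and contents are placeholders.

```ruby
require 'tmpdir'

# Hypothetical sample file so the example is self-contained.
path = File.join(Dir.mktmpdir, 'notes.txt')
File.write(path, "alpha\nbeta\n")

# File.read opens the file, reads everything and closes it again.
data = File.read(path)

# The result is one string; split it if individual lines are needed.
lines = data.split("\n")
```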

The structure of the data in the assigned variable depends on the data being read. If you are reading simple ASCII text, the data will be a single string in which the lines are separated by newline characters (\n).

If your application requires that you separate the open, read and close operations, you can break these into their separate methods.
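One way the three steps could look when kept separate, using readlines to read every line into an array. The sample file is hypothetical.

```ruby
require 'tmpdir'

# Hypothetical sample file so the example is self-contained.
path = File.join(Dir.mktmpdir, 'notes.txt')
File.write(path, "alpha\nbeta\n")

file = File.open(path, 'r') # open
lines = file.readlines      # read: one array element per line, newline included
file.close                  # close
```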

If your application requires that you read the file one line at a time, then you can use the readline method instead of the readlines method.
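A short sketch of reading one line at a time. Because readline raises EOFError once the end of the file is reached, this version checks eof? before each call; the sample file is hypothetical.

```ruby
require 'tmpdir'

# Hypothetical sample file so the example is self-contained.
path = File.join(Dir.mktmpdir, 'notes.txt')
File.write(path, "alpha\nbeta\n")

collected = []
File.open(path, 'r') do |file|
  # readline returns the next line, including its trailing newline.
  collected << file.readline until file.eof?
end
```

The block form of File.open used here closes the file automatically when the block finishes.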

One quick note: if you are looking for a summary of the methods available for working with files, look at the IO class as well as the File class, because File inherits from IO.

Writing

Writing to a file is just as simple as reading from it, with the difference that the file has to be opened with the write option.
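A minimal writing sketch, assuming a hypothetical file named output.txt. It also shows the append mode from the mode table for adding to an existing file.

```ruby
require 'tmpdir'

path = File.join(Dir.mktmpdir, 'output.txt') # hypothetical file name

# 'w' creates the file or truncates an existing one.
file = File.open(path, 'w')
file.write("line one\n") # write outputs exactly the string it is given
file.puts('line two')    # puts adds a newline when the string lacks one
file.close

# Reopen in append mode so the earlier content is preserved.
File.open(path, 'a') { |f| f.puts('line three') }

written = File.read(path)
```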

CSV Files

CSV is a very commonly used file format, and support for it is included in the standard Ruby library. You will, however, have to add the statement require 'csv' before using the CSV class. It operates in much the same way as the File class.
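The sketch below shows one possible way to read and write CSV data with CSV.read and CSV.open. The file names and the sample rows are made up for illustration.

```ruby
require 'csv'
require 'tmpdir'

dir = Dir.mktmpdir
in_path = File.join(dir, 'people.csv')     # hypothetical input file
out_path = File.join(dir, 'people_out.csv') # hypothetical output file
File.write(in_path, "name,age\nAda,36\nLinus,28\n")

# CSV.read returns an array of rows; every field is a string.
rows = CSV.read(in_path)

# CSV.open takes a mode just like File.open; each << writes one row.
CSV.open(out_path, 'w') do |csv|
  csv << ['name', 'age']
  csv << ['Grace', 45]
end

output = File.read(out_path)
```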

Another, more convenient way of reading in a CSV file would be to use the foreach method.
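A sketch of foreach combined with headers: true and .with_index(1). The sample data, the age column and the error message format are assumptions made for the example.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical sample file; the second data row has a bad age value.
path = File.join(Dir.mktmpdir, 'people.csv')
File.write(path, "name,age\nAda,36\nLinus,abc\n")

errors = []

# Without a block, CSV.foreach returns an enumerator, so with_index can be chained.
CSV.foreach(path, headers: true).with_index(1) do |row, row_index|
  begin
    Integer(row['age']) # raises ArgumentError for non-numeric text
  rescue ArgumentError
    errors << "Row #{row_index}: invalid age #{row['age'].inspect}"
  end
end

puts errors
```

Since the header line is consumed by headers: true, row_index 1 refers to the first data row.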

When you iterate with foreach and chain .with_index(1), you can use row_index to tell the user which row a problem occurred on if there is an error processing the CSV file. Because .with_index(1) tells Ruby to use 1 as the starting row number, you do not have to use row_index + 1 in the message. Opening the file with the headers: true option means that each row is read in as a CSV::Row, which behaves like a hash with the column header as the key and the column value as the value, so you can use the column header in your error message as well.

Note also that when you use the foreach iterator, the file is automatically closed once reading is complete.

Digging in CSV files

In the Ruby Basics section, we covered using the dig method for arrays, hashes and structs. You can also use the dig method with CSV data. There are two options: digging tables and digging rows.

To dig a table, you would supply both the row number and the column name:
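For instance, assuming the data has been parsed with headers: true so that it is a CSV::Table, a table dig might look like this. The inline CSV text is made up; CSV.read(path, headers: true) would give the same kind of table from a file.

```ruby
require 'csv'

# Hypothetical data parsed into a CSV::Table.
table = CSV.parse("name,age\nAda,36\nLinus,28\n", headers: true)

# Row numbers start at 0 and do not count the header line.
second_name = table.dig(1, 'name')
first_age = table.dig(0, 'age')
```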

To dig in a row, you simply supply the column name:
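A row dig could be sketched as follows, again with made-up inline data. Each element of the table is a CSV::Row.

```ruby
require 'csv'

# Hypothetical data parsed into a CSV::Table.
table = CSV.parse("name,age\nAda,36\nLinus,28\n", headers: true)

row = table[0]            # first data row, a CSV::Row
name = row.dig('name')    # look the value up by column name
```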

As with arrays and hashes, if you dig for an element that is not found, dig returns nil instead of raising an error.
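The nil behavior can be checked with a sketch like this one. The column name missing and the row number 5 are deliberately absent from the made-up data.

```ruby
require 'csv'

# Hypothetical data parsed into a CSV::Table with two data rows.
table = CSV.parse("name,age\nAda,36\nLinus,28\n", headers: true)

missing_column = table[0].dig('missing') # no such column in the row
missing_row = table.dig(5, 'name')       # no such row in the table
```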

YAML Files

Ruby also supports reading and writing YAML files as part of the standard library. As with the CSV support, you will need to include the YAML support with a require statement (require 'yaml'). Because YAML files are not line oriented, you read everything in, or write everything out, in a single method call rather than opening, processing and closing the file line by line.

To read YAML files, you will need to use the load_file method.
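A minimal sketch of load_file, assuming a hypothetical config.yml with a few simple keys.

```ruby
require 'yaml'
require 'tmpdir'

# Hypothetical YAML file so the example is self-contained.
path = File.join(Dir.mktmpdir, 'config.yml')
File.write(path, "name: demo\nport: 8080\ntags:\n  - a\n  - b\n")

# load_file opens, parses and closes the file in one call.
config = YAML.load_file(path)

puts config['port']
```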

Writing to a YAML file is a bit more complex. First, you serialize the object to a YAML string with the dump method, and then you write that string to a file.
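The two steps might look like the sketch below. The settings hash and the file name are placeholders.

```ruby
require 'yaml'
require 'tmpdir'

path = File.join(Dir.mktmpdir, 'settings.yml') # hypothetical file name
settings = { 'name' => 'demo', 'port' => 8080 }

# Step 1: serialize the object to YAML text.
yaml_text = YAML.dump(settings)

# Step 2: write that text to the file.
File.write(path, yaml_text)

# Reading the file back should give an equal hash.
round_trip = YAML.load_file(path)
```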

Alternatively, you can use to_yaml instead of the dump method.
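The to_yaml variant, sketched with the same kind of placeholder data. It becomes available on objects once yaml has been required.

```ruby
require 'yaml'
require 'tmpdir'

path = File.join(Dir.mktmpdir, 'settings.yml') # hypothetical file name
settings = { 'name' => 'demo', 'port' => 8080 }

# to_yaml returns the YAML text for the object it is called on.
File.write(path, settings.to_yaml)

same_text = settings.to_yaml == YAML.dump(settings)
round_trip = YAML.load_file(path)
```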

JSON Files

JSON files are similar to YAML files, in that they contain structured data and cannot be read in one line at a time. Instead, you will need to read the contents in and use JSON to parse the contents of the file. The parsed contents are then typically a hash (or an array, if the top-level JSON value is an array).
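A reading sketch, assuming a hypothetical settings.json and the json library from the standard distribution.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical JSON file so the example is self-contained.
path = File.join(Dir.mktmpdir, 'settings.json')
File.write(path, '{"name": "demo", "port": 8080, "tags": ["a", "b"]}')

# Read the whole file, then parse the text into Ruby objects.
settings = JSON.parse(File.read(path))

puts settings['name']
```

Keys come back as strings by default; passing symbolize_names: true to JSON.parse returns symbol keys instead.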

Normally, a JSON file is formatted so that it is easier to read in a text editor. To write pretty-formatted output to a JSON file, use the pretty_generate method.
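One possible way to write pretty-formatted JSON. The data and file name are placeholders.

```ruby
require 'json'
require 'tmpdir'

path = File.join(Dir.mktmpdir, 'settings.json') # hypothetical file name
settings = { 'name' => 'demo', 'tags' => ['a', 'b'] }

# pretty_generate returns indented, multi-line JSON text.
pretty = JSON.pretty_generate(settings)
File.write(path, pretty)

# Parsing the file again should give back an equal hash.
round_trip = JSON.parse(File.read(path))
```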

File Utilities

Below is a table listing many of the file utilities available in Ruby. You will see that there is a great deal of duplication between the various classes, and some methods simply call a method of the same name in a different class.

File Class Methods

| Method Name | Method Purpose |
| --- | --- |
| Creation and Opening | |
| new | Opens the file at the given path according to the given mode; creates and returns a new File object for that file. |
| open | Creates a new File object, via File.new with the given arguments. |
| Reading and Writing | |
| read | Reads the full contents of the file. |
| readlines | Reads and returns all remaining lines from the stream; does not modify $_. |
| write | Writes each of the given objects to self, which must be opened for writing; returns the total number of bytes written; each object that is not a string is converted via the method to_s. |
| binread | Behaves like IO.read, except that the stream is opened in binary mode with ASCII-8BIT encoding. |
| binwrite | Behaves like IO.write, except that the stream is opened in binary mode with ASCII-8BIT encoding. |
| foreach | Calls the block with each successive line read from the stream. |
| File Information | |
| exist? | Returns true if the named file exists. |
| file? | Returns true if the named file exists and is a regular file. |
| directory? | With string object given, returns true if path is a string path leading to a directory, or to a symbolic link to a directory; false otherwise. |
| size | Returns the size of file_name. |
| size? | Returns nil if file_name doesn’t exist or has zero size, the size of the file otherwise. |
| zero? | Returns true if the named file exists and has a zero size. |
| basename | Returns the last component of the filename given in file_name (after first stripping trailing separators), which can be formed using both File::SEPARATOR and File::ALT_SEPARATOR as the separator when File::ALT_SEPARATOR is not nil. |
| dirname | Returns all components of the filename given in file_name except the last one (after first stripping trailing separators). The filename can be formed using both File::SEPARATOR and File::ALT_SEPARATOR as the separator when File::ALT_SEPARATOR is not nil. |
| extname | Returns the extension (the portion of file name in path starting from the last period). |
| split | Splits the given string into a directory and a file component and returns them in a two-element array. See also File::dirname and File::basename. |
| join | Returns a new string formed by joining the strings using "/". |
| expand_path | Converts a pathname to an absolute pathname. Relative paths are referenced from the current working directory of the process unless dir_string is given, in which case it will be used as the starting point. The given pathname may start with a “~”, which expands to the process owner’s home directory (the environment variable HOME must be set correctly). “~user” expands to the named user’s home directory. |
| absolute_path | Converts a pathname to an absolute pathname. Relative paths are referenced from the current working directory of the process unless dir_string is given, in which case it will be used as the starting point. If the given pathname starts with a “~” it is NOT expanded, it is treated as a normal directory name. |
| realpath | Returns the real (absolute) pathname of pathname in the actual filesystem not containing symlinks or useless dots. |
| identical? | Returns true if the named files are identical. |
| File Manipulation | |
| delete / unlink | Deletes the named files, returning the number of names passed as arguments. Raises an exception on any error. Since the underlying implementation relies on the unlink(2) system call, the type of exception raised depends on its error type (see linux.die.net/man/2/unlink) and has the form of e.g. Errno::ENOENT. |
| rename | Renames the given file to the new name. Raises a SystemCallError if the file cannot be renamed. |
| symlink | Creates a symbolic link called new_name for the existing file old_name. Raises a NotImplementedError exception on platforms that do not support symbolic links. |
| link | Creates a new name for an existing file using a hard link. Will not overwrite new_name if it already exists (raising a subclass of SystemCallError). Not available on all platforms. |
| chmod | Changes permission bits on the named file(s) to the bit pattern represented by mode_int. Actual effects are operating system dependent (see the beginning of this section). On Unix systems, see chmod(2) for details. Returns the number of files processed. |
| chown | Changes the owner and group of the named file(s) to the given numeric owner and group id’s. Only a process with superuser privileges may change the owner of a file. The current owner of a file may change the file’s group to any group to which the owner belongs. A nil or -1 owner or group id is ignored. Returns the number of files processed. |
| truncate | Truncates the file file_name to be at most integer bytes long. Not available on all platforms. |
| utime | Sets the access and modification times of each named file to the first two arguments. If a file is a symlink, this method acts upon its referent rather than the link itself; for the inverse behavior see File.lutime. Returns the number of file names in the argument list. |
| Access and Permission | |
| readable? | Returns true if the named file is readable by the effective user and group id of this process. |
| writable? | Returns true if the named file is writable by the effective user and group id of this process. |
| executable? | Returns true if the named file is executable by the effective user and group id of this process. |
| readable_real? | Returns true if the named file is readable by the real user and group id of this process. |
| writable_real? | Returns true if the named file is writable by the real user and group id of this process. |
| executable_real? | Returns true if the named file is executable by the real user and group id of this process. |
| File Times | |
| atime | Returns the last access time for the named file as a Time object. |
| mtime | Returns the modification time for the named file as a Time object. |
| ctime | Returns the change time for the named file (the time at which directory information about the file was changed, not the file itself). |

FileUtils Module Methods

| Method Name | Method Purpose |
| --- | --- |
| cp | Copies files. |
| mv | Moves entries. |
| rm | Removes entries at the paths in the given list (a single path or an array of paths); returns list, if it is an array, [list] otherwise. |
| rm_f | Equivalent to: `FileUtils.rm(list, force: true, **kwargs)` |
| rm_r | Removes entries at the paths in the given list (a single path or an array of paths), removing directories recursively; returns list, if it is an array, [list] otherwise. |
| rm_rf | Equivalent to: `FileUtils.rm_r(list, force: true, **kwargs)` |
| ln | Creates hard links. |
| ln_s | Creates symbolic links. |
| mkdir | Creates directories at the paths in the given list (a single path or an array of paths); returns list if it is an array, [list] otherwise. |
| mkdir_p | Creates directories at the paths in the given list (a single path or an array of paths), also creating ancestor directories as needed; returns list if it is an array, [list] otherwise. |
| rmdir | Removes directories at the paths in the given list (a single path or an array of paths); returns list, if it is an array, [list] otherwise. |
| chmod / chmod_R | Changes permissions on the entries at the paths given in list (a single path or an array of paths) to the permissions given by mode; returns list if it is an array, [list] otherwise. _R is for recursive operations. |
| chown / chown_R | Changes the owner and group on the entries at the paths given in list (a single path or an array of paths) to the given user and group; returns list if it is an array, [list] otherwise. _R is for recursive operations. |
| touch | Updates modification times (mtime) and access times (atime) of the entries given by the paths in list (a single path or an array of paths); returns list if it is an array, [list] otherwise. |

Dir Class Methods

| Method Name | Method Purpose |
| --- | --- |
| pwd | Returns the path to the current working directory of this process as a string. |
| chdir | Changes the current working directory of the process to the given string. When called without an argument, changes the directory to the value of the environment variable HOME, or LOGDIR. Raises a SystemCallError (probably Errno::ENOENT) if the target directory does not exist. |
| home | Returns the home directory of the current user or the named user if given. |
| entries | Returns an array containing all of the filenames in the given directory. Will raise a SystemCallError if the named directory doesn’t exist. |
| foreach | Calls the block once for each entry in the named directory, passing the filename of each entry as a parameter to the block. |
| glob | Expands pattern, which is a pattern string or an Array of pattern strings, and returns an array containing the matching filenames. If a block is given, calls the block once for each matching filename, passing the filename as a parameter to the block. |
| mkdir | Makes a new directory named by string, with permissions specified by the optional parameter anInteger. The permissions may be modified by the value of File::umask, and are ignored on NT. Raises a SystemCallError if the directory cannot be created. See also the discussion of permissions in the class documentation for File. |
| rmdir / delete | Deletes the named directory. Raises a subclass of SystemCallError if the directory isn’t empty. |

IO Class Methods (Parent class of File)

| Method Name | Method Purpose |
| --- | --- |
| read | Reads bytes from the stream; the stream must be opened for reading (see Access Modes). |
| write | Writes each of the given objects to self, which must be opened for writing (see Access Modes); returns the total number of bytes written; each object that is not a string is converted via the method to_s. |
| foreach | Calls the block with each successive line read from the stream. |
| popen | Executes the given command cmd as a subprocess whose $stdin and $stdout are connected to a new stream io. |
| sysopen | Opens the file at the given path with the given mode and permissions; returns the integer file descriptor. |
| copy_stream | Copies from the given src to the given dst, returning the number of bytes copied. |
| pipe | Creates a pair of pipe endpoints, read_io and write_io, connected to each other. |

Pathname Class Methods (largely duplicated in the File class)

| Method Name | Method Purpose |
| --- | --- |
| new | Creates a Pathname object from the given String (or String-like object). If path contains a NULL character (\0), an ArgumentError is raised. |
| basename | Returns the last component of the path. |
| dirname | Returns all but the last component of the path. |
| extname | Returns the file’s extension. |
| exist? | Returns true if the named file exists. |
| directory? | With string object given, returns true if path is a string path leading to a directory, or to a symbolic link to a directory; false otherwise. |
| file? | Returns true if the named file exists and is a regular file. |
| realpath | Returns the real (absolute) pathname for self in the actual filesystem. |
| join | Joins the given pathnames onto self to create a new Pathname object. This is effectively the same as using Pathname#+ to append self and all arguments sequentially. |
| delete | Removes a file or directory, using File.unlink if self is a file, or Dir.unlink as necessary. |
| unlink | Removes a file or directory, using File.unlink if self is a file, or Dir.unlink as necessary. |
| rename | Renames the file. |
| chmod | Changes file permissions. |
| chown | Changes owner and group of the file. |
| truncate | Truncates the file to length bytes. |
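
A few of the utilities listed above can be combined as in the following sketch. The directory layout and the file name summary.csv are hypothetical, and everything is created inside a temporary directory that is removed at the end.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

root = Dir.mktmpdir
report_dir = File.join(root, 'reports', '2024')

# mkdir_p also creates the missing ancestor directory 'reports'.
FileUtils.mkdir_p(report_dir)

# touch creates an empty file when the path does not exist yet.
report = File.join(report_dir, 'summary.csv')
FileUtils.touch(report)

base = File.basename(report)      # last path component
ext = File.extname(report)        # extension, including the period
regular = File.file?(report)      # true for a regular file
empty = File.zero?(report)        # true because nothing was written

# '**' makes glob search the directories below root recursively.
found = Dir.glob(File.join(root, '**', '*.csv'))

# Pathname offers the same query in object form.
pathname_base = Pathname.new(report).basename.to_s

# rm_rf removes the whole temporary tree.
FileUtils.rm_rf(root)
removed = !File.exist?(root)
```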