Last updated on June 30th, 2024 at 03:20 pm
Standard File Operations
Ruby can natively read and write many different file types. Common to all file types is the need to open, read, write and close.
Opening
A file is opened in Ruby using the open method, which in its simplest form just takes a file name and attempts to open the file for reading.
my_file = File.open(filename)
The open method can also take parameters and options. For example, if you want to open a file read-only, add the mode parameter ‘r’.
my_file = File.open(filename, 'r')
You can also open the file for writing by using ‘w’ as the mode parameter.
my_file = File.open(filename, 'w')
The mode parameters available to you are:
| Mode | Purpose | 
| ‘r’ | Read only, starts reading at the beginning of the file | 
| ‘r+’ | Read-write, starts at the beginning of the file | 
| ‘w’ | Write only. Truncates the file if it already exists. Creates a new file if one does not exist. | 
| ‘w+’ | Read-write. Truncates the file if it already exists. Creates a new file if one does not exist. | 
| ‘a’ | Write-only. Each write call appends data to end of file. Creates a new file if one does not exist. | 
| ‘a+’ | Read-write. Each write call appends data to end of file. Creates a new file if one does not exist. | 
| Additional Modes – must be used in accompaniment with previous modes | |
| ‘b’ | Binary file mode. Will not use EOL conversion. Sets external encoding to ASCII-8BIT by default | 
| ‘t’ | Text file mode. | 
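To make the difference between truncating and appending concrete, here is a minimal sketch that writes to the same file with ‘w’, then ‘a’, then ‘w’ again. The helper name demonstrate_modes and the file name modes_demo.txt are placeholders chosen for this illustration, and the file lives in a temporary directory so nothing on your system is touched.

```ruby
require 'tmpdir'

# Returns the file contents after an append and after a second truncating write.
# The file name 'modes_demo.txt' is a placeholder for this sketch.
def demonstrate_modes(dir)
  path = File.join(dir, 'modes_demo.txt')

  # 'w' creates the file (or truncates an existing one)
  file = File.open(path, 'w')
  file.write("first\n")
  file.close

  # 'a' keeps the existing content and appends to the end
  file = File.open(path, 'a')
  file.write("second\n")
  file.close
  after_append = File.read(path)

  # 'w' again discards everything that was there before
  file = File.open(path, 'w')
  file.write("third\n")
  file.close
  after_truncate = File.read(path)

  [after_append, after_truncate]
end

Dir.mktmpdir do |dir|
  after_append, after_truncate = demonstrate_modes(dir)
  puts after_append    # prints "first" and "second" on separate lines
  puts after_truncate  # prints "third"
end
```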
For example, let’s assume we want to open a file write-only and binary.
my_file = File.open(filename, 'wb')
Reading
If you want to open a file and read in one statement, then the read method is just what you’re looking for. It will open, read and close the file in one statement.
my_data = File.read("some_data.txt")
File.read returns the entire contents of the file as a single string. If you are reading simple ASCII text, the lines within that string are separated by newlines (\n).
If your application requires that you separate the open, read and close operations, you can break these into their separate methods.
my_file = File.open('some_data.txt')
my_data = my_file.readlines
my_file.close
If your application requires that you read the file one line at a time, then you can use the readline method instead of the readlines method.
my_file = File.open('some_data.txt')
begin
  while (my_line = my_file.readline)
    # ... do something useful with the data just read in ...
  end
rescue EOFError
  # readline raises EOFError once the end of the file is reached
ensure
  my_file.close
end
One quick note: if you are looking for a summary of the commands available to operate on files, you have not only the File class to look into, but also the IO class, because File inherits from IO.
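As an alternative sketch to the explicit open, read and close sequences above, File.open also accepts a block and closes the file when the block ends, and File.foreach yields one line at a time. The helper names read_all_lines and count_non_empty_lines are hypothetical, chosen only for this illustration.

```ruby
require 'tmpdir'

# Reads all lines using the block form of File.open; the file is
# closed automatically when the block finishes, even if it raises.
def read_all_lines(path)
  File.open(path, 'r') do |file|
    file.readlines(chomp: true)
  end
end

# Processes one line at a time with File.foreach, which opens the file,
# yields each line, and closes the file at the end.
def count_non_empty_lines(path)
  count = 0
  File.foreach(path) do |line|
    count += 1 unless line.strip.empty?
  end
  count
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'some_data.txt')
  File.write(path, "alpha\n\nbeta\n")
  p read_all_lines(path)         # => ["alpha", "", "beta"]
  p count_non_empty_lines(path)  # => 2
end
```

With either form there is no close call to forget and no EOFError to rescue, which is why the block form is usually preferred when the whole operation fits in one place.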
Writing
Writing to a file is just as simple as reading from it, with the difference that the file has to be opened with the write option.
my_file = File.open('some_data.txt', 'w')
my_file.write 'text to write'
my_file.close
CSV Files
CSV is a very commonly used file format, and support for it is included in the standard Ruby library, although you must add the statement require 'csv' before using the CSV class. The CSV class operates very much like the File class.
require 'csv'
my_file = CSV.open('some_data.csv')
begin
  # CSV#readline returns nil at the end of the file, which ends the loop
  while (my_line = my_file.readline)
    # ... do something useful with the row just read in ...
  end
rescue CSV::MalformedCSVError # for example...
  # Do something to rescue if you want
ensure
  my_file.close
end
Another, more convenient way of reading in a CSV file would be to use the foreach method.
require 'csv'
CSV.foreach(filename, headers: true).with_index(1) do |row, row_index|
    row.each_with_index do |(header, value), index|
      # do something with the row
    end
end
In the example above, if you have an error processing the CSV file, you can use row_index to tell the user what row the problem occurred on. Because the iteration uses .with_index(1), which tells the system to use 1 as the starting row number, we don’t have to use row_index + 1 when telling the user which row the problem occurred on. Opening the file with the headers: true option means that each row is read in as a CSV::Row, which behaves much like a hash with the column header as the key and the column value as the value. This allows you to use the column header in your error message as well.
Note also that when you use the foreach iterator, the file is closed automatically once reading is complete.
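The row-level error reporting described above could look something like the following sketch. The helper name find_invalid_ages, the 'age' column and the whole-number validation rule are all assumptions made for illustration. Note that row_index counts data rows, so the header line is not included in the numbering.

```ruby
require 'csv'
require 'tmpdir'

# Collects a message for every row whose 'age' column is not a whole number.
# The column name 'age' and the validation rule are assumptions for this sketch.
def find_invalid_ages(filename)
  errors = []
  CSV.foreach(filename, headers: true).with_index(1) do |row, row_index|
    row.each do |header, value|
      next unless header == 'age'
      next if value.to_s.match?(/\A\d+\z/)

      errors << "Row #{row_index}, column '#{header}': invalid value #{value.inspect}"
    end
  end
  errors
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'people.csv')
  File.write(path, "name,age\nAlice,30\nBob,abc\n")
  puts find_invalid_ages(path)
  # prints: Row 2, column 'age': invalid value "abc"
end
```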
Digging in CSV files
In the Ruby Basics section, we covered using the dig method for arrays, hashes and structs. You can also use the dig command with CSV files. There are two options for the dig method within CSV data – digging tables and digging rows.
To dig a table, you would supply both the row number and the column name:
require 'csv'
# Sample CSV content
# name,age,city
# Alice,30,New York
# Bob,25,Los Angeles
# Charlie,35,Chicago
csv_text = <<~CSV
  name,age,city
  Alice,30,New York
  Bob,25,Los Angeles
  Charlie,35,Chicago
CSV
csv_table = CSV.parse(csv_text, headers: true)
# Using dig to access data in CSV::Table
alice_city = csv_table.dig(0, 'city')
bob_age = csv_table.dig(1, 'age')
puts "Alice's city: #{alice_city}" # => Alice's city: New York
puts "Bob's age: #{bob_age}" # => Bob's age: 25
To dig in a row, you simply supply the column name:
require 'csv'
# Sample CSV content
# name,age,city
# Alice,30,New York
# Bob,25,Los Angeles
# Charlie,35,Chicago
csv_text = <<~CSV
  name,age,city
  Alice,30,New York
  Bob,25,Los Angeles
  Charlie,35,Chicago
CSV
csv_table = CSV.parse(csv_text, headers: true)
# Using dig to access data in CSV::Row
alice_row = csv_table[0]
bob_row = csv_table[1]
alice_age = alice_row.dig('age')
bob_city = bob_row.dig('city')
puts "Alice's age: #{alice_age}" # => Alice's age: 30
puts "Bob's city: #{bob_city}" # => Bob's city: Los Angeles
Similar to arrays and hashes, if you dig for an element that is not found, dig will return nil instead of raising an error.
require 'csv'
# Sample CSV content with missing fields
# name,age,city
# Alice,30,New York
# Bob,,Los Angeles
# Charlie,35,
csv_text = <<~CSV
  name,age,city
  Alice,30,New York
  Bob,,Los Angeles
  Charlie,35,
CSV
csv_table = CSV.parse(csv_text, headers: true)
# Using dig to safely access data that might be missing
bob_age = csv_table.dig(1, 'age') || 'Unknown'
charlie_city = csv_table.dig(2, 'city') || 'Unknown'
puts "Bob's age: #{bob_age}" # => Bob's age: Unknown
puts "Charlie's city: #{charlie_city}" # => Charlie's city: Unknown
YAML Files
Ruby also supports reading of YAML files as part of the standard library. As with the CSV support, you will need to include the YAML support with a require statement. Because YAML files are not line oriented, you will open, read everything in / write everything out, and close the file in a single method.
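As a small sketch of that single-call style, the following writes a plain hash to a YAML file and reads it back. The helper name yaml_round_trip, the file name and the hash contents are placeholders for illustration. Only basic types (strings, integers, arrays, hashes) are used here; on recent Ruby versions the YAML loader restricts which classes it will deserialize, so loading custom objects may need extra options.

```ruby
require 'yaml'
require 'tmpdir'

# Writes a plain hash to a YAML file and reads it back.
# The file name and the hash contents are placeholders for this sketch.
def yaml_round_trip(dir, data)
  path = File.join(dir, 'some_data.yml')
  File.write(path, YAML.dump(data))  # serialize and write in one statement
  YAML.load_file(path)               # open, parse and close in one call
end

Dir.mktmpdir do |dir|
  settings = { 'name' => 'example', 'retries' => 3, 'tags' => %w[a b] }
  loaded = yaml_round_trip(dir, settings)
  p loaded == settings  # => true
end
```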
To read YAML files, you will need to use the load_file method.
require 'yaml'
my_yml_data = YAML.load_file('some_data.yml')
Writing to a YAML file is a bit more complex. First, you need to serialize an object to a YAML string with dump, and then write that string to a file.
require 'yaml'
Auto = Struct.new(:make, :model, :year)
audi = Auto.new('Audi', 'A1', 2004)
serialized_auto = YAML.dump(audi)
File.write('Autos.yml', serialized_auto)
Alternatively, you can use to_yaml instead of the dump method.
require 'yaml'
Auto = Struct.new(:make, :model, :year)
yaml_auto = Auto.new('Audi', 'A1', 2004).to_yaml
File.write('Autos.yml', yaml_auto)
JSON Files
JSON files are similar to YAML files in that they contain structured data and cannot be read in one line at a time. Instead, you need to read the contents in and use JSON to parse them. For a JSON object, the parsed result is an ordinary hash with string keys.
require 'json'
# Read the file
file_content = File.read('data.json')
# Parse the JSON content
data = JSON.parse(file_content)
# Now you can use the data
data['users'].each do |user|
  puts "User ID: #{user['id']}, Name: #{user['name']}, Email: #{user['email']}"
end
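Parsing can fail if the file does not contain valid JSON, so it is often worth rescuing JSON::ParserError. The sketch below also uses the symbolize_names: true option to get symbol keys instead of string keys. The helper name user_names and the sample data are hypothetical; the 'users' and 'name' keys simply mirror the example above.

```ruby
require 'json'

# Parses JSON text and returns the list of user names.
# Returns an empty array if the text is not valid JSON.
# The 'users' / 'name' keys mirror the example above.
def user_names(json_text)
  data = JSON.parse(json_text, symbolize_names: true)
  data.fetch(:users, []).map { |user| user[:name] }
rescue JSON::ParserError
  []
end

# Sample data invented for this illustration
sample = '{"users": [{"id": 1, "name": "Alice", "email": "alice@example.com"}]}'
p user_names(sample)       # => ["Alice"]
p user_names('{not json')  # => []
```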
Normally, a JSON file is formatted so that it is easier to read in a text editor. To write pretty-formatted output to a JSON file, use the pretty_generate method.
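require 'json'
# Sample data and output path (placeholders for illustration)
data = { 'users' => [{ 'id' => 1, 'name' => 'Alice', 'email' => 'alice@example.com' }] }
file_path = 'output.json'
# Convert the data to JSON format
json_data = JSON.pretty_generate(data)
# Write the JSON data to the file
File.open(file_path, 'w') do |file|
  file.write(json_data)
end
File Utilities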
Below is a reference table of the most commonly used file utilities available in Ruby. You will see that there is a great deal of duplication between the various classes, and that some methods simply call a method of the same name in a different class.
| Method Name | Method Purpose | 
| File Class Methods | |
| Creation and Opening | |
| new | Opens the file at the given path according to the given mode; creates and returns a new File object for that file. | 
| open | Creates a new File object, via File.new with the given arguments. | 
| Reading and Writing | |
| read | Reads the full contents of the file. | 
| readlines | Reads and returns all remaining lines from the stream; does not modify $_. | 
| write | Writes each of the given objects to self, which must be opened for writing; returns the total number of bytes written; each object that is not a string is converted via method to_s. | 
| binread | Behaves like IO.read, except that the stream is opened in binary mode with ASCII-8BIT encoding. | 
| binwrite | Behaves like IO.write, except that the stream is opened in binary mode with ASCII-8BIT encoding. | 
| foreach | Calls the block with each successive line read from the stream. | 
| File Information | |
| exist? | Returns true if the named file exists. | 
| file? | Returns true if the named file exists and is a regular file. | 
| directory? | With a string object given, returns true if path is a string path leading to a directory, or to a symbolic link to a directory; false otherwise. | 
| size | Returns the size of file_name. | 
| size? | Returns nil if file_name doesn’t exist or has zero size, the size of the file otherwise. | 
| zero? | Returns true if the named file exists and has a zero size. | 
| basename | Returns the last component of the filename given in file_name (after first stripping trailing separators), which can be formed using both File::SEPARATOR and File::ALT_SEPARATOR as the separator when File::ALT_SEPARATOR is not nil. | 
| dirname | Returns all components of the filename given in file_name except the last one (after first stripping trailing separators). The filename can be formed using both File::SEPARATOR and File::ALT_SEPARATOR as the separator when File::ALT_SEPARATOR is not nil. | 
| extname | Returns the extension (the portion of the file name in path starting from the last period). | 
| split | Splits the given string into a directory and a file component and returns them in a two-element array. See also File::dirname and File::basename. | 
| join | Returns a new string formed by joining the strings using "/". | 
| expand_path | Converts a pathname to an absolute pathname. Relative paths are referenced from the current working directory of the process unless dir_string is given, in which case it will be used as the starting point. The given pathname may start with a “~”, which expands to the process owner’s home directory (the environment variable HOME must be set correctly). “~user” expands to the named user’s home directory. | 
| absolute_path | Converts a pathname to an absolute pathname. Relative paths are referenced from the current working directory of the process unless dir_string is given, in which case it will be used as the starting point. If the given pathname starts with a “~” it is NOT expanded; it is treated as a normal directory name. | 
| realpath | Returns the real (absolute) pathname of pathname in the actual filesystem not containing symlinks or useless dots. | 
| identical? | Returns true if the named files are identical. | 
| File Manipulation | |
| delete / unlink | Deletes the named files, returning the number of names passed as arguments. Raises an exception on any error. Since the underlying implementation relies on the unlink(2) system call, the type of exception raised depends on its error type (see linux.die.net/man/2/unlink) and has the form of e.g. Errno::ENOENT. | 
| rename | Renames the given file to the new name. Raises a SystemCallError if the file cannot be renamed. | 
| symlink | Creates a symbolic link called new_name for the existing file old_name. Raises a NotImplementedError on platforms that do not support symbolic links. | 
| link | Creates a new name for an existing file using a hard link. Will not overwrite new_name if it already exists (raising a subclass of SystemCallError). Not available on all platforms. | 
| chmod | Changes permission bits on the named file(s) to the bit pattern represented by mode_int. Actual effects are operating system dependent. On Unix systems, see chmod(2) for details. Returns the number of files processed. | 
| chown | Changes the owner and group of the named file(s) to the given numeric owner and group IDs. Only a process with superuser privileges may change the owner of a file. The current owner of a file may change the file’s group to any group to which the owner belongs. A nil or -1 owner or group ID is ignored. Returns the number of files processed. | 
| truncate | Truncates the file file_name to be at most integer bytes long. Not available on all platforms. | 
| utime | Sets the access and modification times of each named file to the first two arguments. If a file is a symlink, this method acts upon its referent rather than the link itself; for the inverse behavior see File.lutime. Returns the number of file names in the argument list. | 
| Access and Permission | |
| readable? | Returns true if the named file is readable by the effective user and group id of this process. | 
| writable? | Returns true if the named file is writable by the effective user and group id of this process. | 
| executable? | Returns true if the named file is executable by the effective user and group id of this process. | 
| readable_real? | Returns true if the named file is readable by the real user and group id of this process. | 
| writable_real? | Returns true if the named file is writable by the real user and group id of this process. | 
| executable_real? | Returns true if the named file is executable by the real user and group id of this process. | 
| File Times | |
| atime | Returns the last access time for the named file as a Time object. | 
| mtime | Returns the modification time for the named file as a Time object. | 
| ctime | Returns the change time for the named file (the time at which directory information about the file was changed, not the file itself). | 
| FileUtils Module Methods | |
| cp | Copies files. | 
| mv | Moves entries. | 
| rm | Removes entries at the paths in the given list (a single path or an array of paths); returns list if it is an array, [list] otherwise. | 
| rm_f | Equivalent to: FileUtils.rm(list, force: true, **kwargs) | 
| rm_r | Removes entries at the paths in the given list (a single path or an array of paths), removing directories recursively; returns list if it is an array, [list] otherwise. | 
| rm_rf | Equivalent to: FileUtils.rm_r(list, force: true, **kwargs) | 
| ln | Creates hard links. | 
| ln_s | Creates symbolic links. | 
| mkdir | Creates directories at the paths in the given list (a single path or an array of paths); returns list if it is an array, [list] otherwise. | 
| mkdir_p | Creates directories at the paths in the given list (a single path or an array of paths), also creating ancestor directories as needed; returns list if it is an array, [list] otherwise. | 
| rmdir | Removes directories at the paths in the given list (a single path or an array of paths); returns list if it is an array, [list] otherwise. | 
| chmod / chmod_R | Changes permissions on the entries at the paths given in list (a single path or an array of paths) to the permissions given by mode; returns list if it is an array, [list] otherwise. The _R variant operates recursively. | 
| chown / chown_R | Changes the owner and group on the entries at the paths given in list (a single path or an array of paths) to the given user and group; returns list if it is an array, [list] otherwise. The _R variant operates recursively. | 
| touch | Updates modification times (mtime) and access times (atime) of the entries given by the paths in list (a single path or an array of paths); returns list if it is an array, [list] otherwise. | 
| Dir Class Methods | |
| pwd | Returns the path to the current working directory of this process as a string. | 
| chdir | Changes the current working directory of the process to the given string. When called without an argument, changes the directory to the value of the environment variable HOME, or LOGDIR. Raises a SystemCallError (probably Errno::ENOENT) if the target directory does not exist. | 
| home | Returns the home directory of the current user or the named user if given. | 
| entries | Returns an array containing all of the filenames in the given directory. Will raise a SystemCallError if the named directory doesn’t exist. | 
| foreach | Calls the block once for each entry in the named directory, passing the filename of each entry as a parameter to the block. | 
| glob | Expands pattern, which is a pattern string or an Array of pattern strings, and returns an array containing the matching filenames. If a block is given, calls the block once for each matching filename, passing the filename as a parameter to the block. | 
| mkdir | Makes a new directory named by string, with permissions specified by the optional integer parameter. The permissions may be modified by the value of File::umask, and are ignored on NT. Raises a SystemCallError if the directory cannot be created. See also the discussion of permissions in the class documentation for File. | 
| rmdir / delete | Deletes the named directory. Raises a subclass of SystemCallError if the directory isn’t empty. | 
| IO Class Methods (Parent class of File) | |
| read | Reads bytes from the stream; the stream must be opened for reading. | 
| write | Writes each of the given objects to self, which must be opened for writing; returns the total number of bytes written; each object that is not a string is converted via method to_s. | 
| foreach | Calls the block with each successive line read from the stream. | 
| popen | Executes the given command cmd as a subprocess whose $stdin and $stdout are connected to a new stream io. | 
| sysopen | Opens the file at the given path with the given mode and permissions; returns the integer file descriptor. | 
| copy_stream | Copies from the given src to the given dst, returning the number of bytes copied. | 
| pipe | Creates a pair of pipe endpoints, read_io and write_io, connected to each other. | 
| Pathname Class Methods (To a large degree, duplicated in File class) | |
| new | Creates a Pathname object from the given String (or String-like object). If path contains a NULL character (\0), an ArgumentError is raised. | 
| basename | Returns the last component of the path. | 
| dirname | Returns all but the last component of the path. | 
| extname | Returns the file’s extension. | 
| exist? | Returns true if the named file exists. | 
| directory? | With a string object given, returns true if path is a string path leading to a directory, or to a symbolic link to a directory; false otherwise. | 
| file? | Returns true if the named file exists and is a regular file. | 
| realpath | Returns the real (absolute) pathname for self in the actual filesystem. | 
| join | Joins the given pathnames onto self to create a new Pathname object. This is effectively the same as using Pathname#+ to append self and all arguments sequentially. | 
| delete | Removes a file or directory, using File.unlink if self is a file, or Dir.unlink as necessary. | 
| unlink | Removes a file or directory, using File.unlink if self is a file, or Dir.unlink as necessary. | 
| rename | Renames the file. | 
| chmod | Changes file permissions. | 
| chown | Changes the owner and group of the file. | 
| truncate | Truncates the file to length bytes. |
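To show how a few of the utilities in the table fit together, here is a short sketch that combines the File path helpers, FileUtils, Dir.glob and Pathname. The path /tmp/reports/summary.tar.gz, the directory layout and the helper name build_and_list are hypothetical. The expected path results assume a Unix-style "/" separator, and all file operations take place in a temporary directory.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

# Path helpers work on strings only; the file does not need to exist.
path = '/tmp/reports/summary.tar.gz'  # hypothetical path
p File.basename(path)           # => "summary.tar.gz"
p File.basename(path, '.gz')    # => "summary.tar"
p File.dirname(path)            # => "/tmp/reports"
p File.extname(path)            # => ".gz"
p File.join('a', 'b', 'c.txt')  # => "a/b/c.txt"

# Creates a nested directory with a few files, then lists the .txt files.
def build_and_list(root)
  nested = File.join(root, 'one', 'two')
  FileUtils.mkdir_p(nested)                        # creates 'one' and 'one/two'
  FileUtils.touch(File.join(nested, 'notes.txt'))  # creates an empty file
  File.write(File.join(nested, 'data.csv'), "a,b\n")
  FileUtils.cp(File.join(nested, 'notes.txt'), File.join(nested, 'copy.txt'))

  # '**' matches directories recursively
  Dir.glob(File.join(root, '**', '*.txt')).map { |f| File.basename(f) }.sort
end

Dir.mktmpdir do |root|
  p build_and_list(root)  # => ["copy.txt", "notes.txt"]

  # Pathname wraps the same operations in an object-oriented interface
  csv_path = Pathname.new(root).join('one', 'two', 'data.csv')
  p csv_path.file?    # => true
  p csv_path.extname  # => ".csv"
  p File.zero?(File.join(root, 'one', 'two', 'notes.txt'))  # => true
end
```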