Chapter 10 - File Input/Output

OUTPUT TO A FILE

Load and display the file named FORMOUT.C for your first example of writing data to a file. We begin as before with the "include" statement for "stdio.h", then define some variables for use in the example, including a rather strange looking new type. The type "FILE" is used for a file variable and is defined in the "stdio.h" file. It is used to define a file pointer for use in file operations. The definition of C contains the requirement for a pointer to a "FILE", and as usual, the name can be any valid variable name.

OPENING A FILE

Before we can write to a file, we must open it. What this really means is that we must tell the system that we want to write to a file and what the filename is. We do this with the "fopen" function illustrated in the first line of the program. The file pointer, "fp" in our case, points to the file, and two arguments are required in the parentheses: the filename first, followed by the file attribute. The filename is any valid DOS filename, and can be expressed in upper or lower case letters, or even mixed if you so desire. It is enclosed in double quotes. For this example we have chosen the name TENLINES.TXT. This file should not exist on your disk at this time. If you have a file with this name, you should change its name or move it, because when we execute this program its contents will be erased. If you don't have a file by this name, that is good because we will create one and put some data into it.

READING ("r")

The second parameter is the file attribute and can be any of three letters, "r", "w", or "a", and must be lower case. There are actually additional cases available in Turbo C to allow more flexible I/O. These are defined on page 95 of the Turbo C Reference Guide. An "r" opens the file for reading, a "w" opens it for writing, and an "a" indicates that you desire to append additional data to the data already in an existing file.
Opening a file for reading requires that the file already exist. If it does not exist, the file pointer will be set to NULL and can be checked by the program.

WRITING ("w")

When a file is opened for writing, it will be created if it does not already exist, and it will be reset if it does, resulting in deletion of any data already there.

APPENDING ("a")

When a file is opened for appending, it will be created if it does not already exist and it will be initially empty. If it does exist, the data output point will be the end of the present data, so that any new data will be added to any data that already exists in the file.

OUTPUTTING TO THE FILE

The job of actually outputting to the file is nearly identical to the outputting we have already done to the standard output device. The only real differences are the new function names and the addition of the file pointer as one of the function arguments. In the example program, "fprintf" replaces our familiar "printf" function name, and the file pointer defined earlier is the first argument within the parentheses. The remainder of the statement looks like, and in fact is identical to, the "printf" statement.

CLOSING A FILE

To close a file, you simply use the function "fclose" with the file pointer in the parentheses. Actually, in this simple program, it is not necessary to close the file because the system will close all open files before returning to DOS. It is good programming practice to get in the habit of closing all files anyway, because doing so acts as a reminder to you of what files are open at the end of each program.

You can open a file for writing, close it, and reopen it for reading, then close it, and open it again for appending, etc. Each time you open it, you could use the same file pointer, or you could use a different one.
The file pointer is simply a tool that you use to point to a file, and you decide what file it will point to.

Compile and run this program. When you run it, you will not get any output to the monitor because it doesn't generate any. After running it, look at your directory for a file named TENLINES.TXT and "type" it. That is where your output will be. Compare the output with that specified in the program. It should agree.

Do not erase the file named TENLINES.TXT yet. We will use it in some of the other examples in this chapter.

OUTPUTTING A SINGLE CHARACTER AT A TIME

Load the next example file, CHAROUT.C, and display it on your monitor. This program will illustrate how to output a single character at a time.

The program begins with the "include" statement, then defines some variables including a file pointer. We have called the file pointer "point" this time, but we could have used any other valid variable name. We then define a string of characters to use in the output function using a "strcpy" function. We are ready to open the file for appending and we do so in the "fopen" function, except this time we use lower case letters for the filename. This is done simply to illustrate that DOS doesn't care about the case of the filename. Notice that the file will be opened for appending, so we will add to the lines inserted during the last program.

The program is actually two nested "for" loops. The outer loop is simply a count to ten so that we will go through the inner loop ten times. The inner loop calls the function "putc" repeatedly until a character in "others" is detected to be a zero.

THE "putc" FUNCTION

The part of the program we are interested in is the "putc" function. It outputs one character at a time, the character being the first argument in the parentheses and the file pointer being the second and last argument.
Why the designer of C made the pointer first in the "fprintf" function, and last in the "putc" function is a good question for which there may be no answer. It seems like this would have been a good place to have used some consistency.

When the textline "others" is exhausted, a newline is needed because a newline was not included in the definition above. A single "putc" is then executed which outputs the "\n" character to return the carriage and do a linefeed.

When the outer loop has been executed ten times, the program closes the file and terminates. Compile and run this program but once again there will be no output to the monitor.

Following execution of the program, "type" the file named TENLINES.TXT and you will see that the 10 new lines were added to the end of the 10 that already existed. If you run it again, yet another 10 lines will be added. Once again, do not erase this file because we are still not finished with it.

READING A FILE

Load the file named READCHAR.C and display it on your monitor. This is our first program to read a file. This program begins with the familiar "include", some data definitions, and the file opening statement which should require no explanation except for the fact that an "r" is used here because we want to read it.

In this program, we check to see that the file exists, and if it does, we execute the main body of the program. If it doesn't, we print a message and quit. If the file does not exist, the system will set the pointer equal to NULL which we can test.

The main body of the program is one "do while" loop in which a single character is read from the file and output to the monitor until an EOF (end of file) is detected from the input file. The file is then closed and the program is terminated.

CAUTION CAUTION CAUTION

At this point, we have the potential for one of the most common and most perplexing problems of programming in C.
The value returned from the "getc" function is an "int", not a "char", so we should use an "int" variable to receive it. A problem develops if we use an "unsigned char" instead, because Turbo C returns a minus one for EOF, which an "unsigned char" type variable is not capable of containing. An "unsigned char" type variable can only have the values of zero to 255, so the minus one is stored as 255, which never compares equal to EOF. This is a very frustrating problem to try to find. The program simply can never find the EOF and will therefore never terminate the loop. A plain "char" is not safe either, because where "char" is signed, a data byte of 255 in the file looks the same as the minus one and ends the loop early. This is easy to prevent: always use an "int" type variable to receive the value returned by "getc".

There is another problem with this program but we will worry about it when we get to the next program and solve it with the one following that.

After you compile and run this program and are satisfied with the results, it would be a good exercise to change the name of "TENLINES.TXT" and run the program again to see that the NULL test actually works as stated. Be sure to change the name back because we are still not finished with "TENLINES.TXT".

READING A WORD AT A TIME

Load and display the file named READTEXT.C for an example of how to read a word at a time.

This program is nearly identical to the last except that this program uses the "fscanf" function to read in a string at a time. Because the "fscanf" function stops reading when it finds a space or a newline character, it will read a word at a time, and display the results one word to a line. You will see this when you compile and run it, but first we must examine a programming problem.

THIS IS A PROBLEM

Inspection of the program will reveal that when we read data in and detect the EOF, we print out something before we check for the EOF, resulting in an extra line of printout. What we usually print out is the same thing printed on the prior pass through the loop because it is still in the buffer "oneword".
We therefore must check for EOF before we execute the "printf" function. This has been done in READGOOD.C, which you will shortly examine, compile, and execute.

Compile and execute the original program we have been studying, READTEXT.C, and observe the output. If you haven't changed TENLINES.TXT you will end up with "Additional" and "lines." on two separate lines with an extra "lines." displayed because of the "printf" before checking for EOF.

Compile and execute READGOOD.C and observe that the extra "lines." does not get displayed because of the extra check for the EOF in the middle of the loop. This was also the problem referred to when we looked at READCHAR.C, but I chose not to expound on it there because the error in the output was not so obvious.

FINALLY, WE READ A FULL LINE

Load and display the file READLINE.C for an example of reading a complete line. This program is very similar to those we have been studying except for the addition of a new quantity, the NULL.

We are using "fgets", which reads an entire line, including the newline character, into a buffer. The buffer to be read into is the first argument in the function call, and the maximum number of characters to read is the second argument, followed by the file pointer. This function will read characters into the input buffer until it either finds a newline character, or it reads the maximum number of characters allowed minus one. It leaves one character for the terminating null character at the end of the string. In addition, if it reaches the EOF before reading any characters, it will return a value of NULL. In our example, when the EOF is found, the pointer "c" will be assigned the value of NULL. NULL is defined as zero in your "stdio.h" file.

When we find that "c" has been assigned the value of NULL, we can stop processing data, but we must check before we print just like in the last program.

Last of course, we close the file.
HOW TO USE A VARIABLE FILENAME

Load and display the file ANYFILE.C for an example of reading from any file. This program asks the user for the filename desired, reads in the filename and opens that file for reading. The entire file is then read and displayed on the monitor. It should pose no problems to your understanding so no additional comments will be made.

Compile and run this program. When it requests a filename, enter the name and extension of any text file available, even one of the example C programs.

HOW DO WE PRINT?

Load the last example file in this chapter, the one named PRINTDAT.C, for an example of how to print. This program should not present any surprises to you so we will move very quickly through it.

Once again, we open TENLINES.TXT for reading and we open PRN for writing. Printing is identical to writing data to a disk file except that we use a standard name for the filename. DOS reserves the "filename" PRN for the printer, so opening it for writing sends the output to the printer instead of to a disk file.

The program is simply a loop in which a character is read, and if it is not the EOF, it is displayed and printed. When the EOF is found, the input file and the printer output file are both closed.

You can now erase TENLINES.TXT from your disk. We will not be using it in any of the later chapters.

PROGRAMMING EXERCISES

1. Write a program that will prompt for a filename for a read file, prompt for a filename for a write file, and open both plus a file to the printer. Enter a loop that will read a character, and output it to the file, the printer, and the monitor. Stop at EOF.

2. Prompt for a filename to read. Read the file a line at a time and display it on the monitor with line numbers.