
How to Parse CSV with Modern C++

This post also shows how modern C++ development works with modern resources and tools.




In this post we will see how to parse CSV with C++ without a ready-made CSV library, using some modern tools and resources such as {fmt} and CMake.


What is CSV?

CSV stands for “comma-separated values”. In short, CSV files are plain-text files used to store tabular data, like a simple “database”.

The RFC (RFC 4180) describes the data as comma-separated. However, there are variants that declare the separator on the first line of the file (sep=;, sep=|). Microsoft Excel, for example, recognizes this variant.
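As a rough illustration of the separator hint described above, the first line of a file can be inspected before parsing the rest. This is only a minimal sketch: the helper name detect_separator is hypothetical (it is not part of this post's program), and it assumes the hint has exactly the form sep=X with a single-character separator.

```cpp
#include <string_view>

// Hypothetical helper (not from the original program): if the first line is a
// "sep=X" hint, return X; otherwise fall back to a comma.
char detect_separator(std::string_view first_line) {
    constexpr std::string_view prefix = "sep=";
    // Tolerate a trailing '\r' left over from Windows line endings.
    if (!first_line.empty() && first_line.back() == '\r') {
        first_line.remove_suffix(1);
    }
    // Assumption: the hint is exactly "sep=" followed by one character.
    if (first_line.size() == prefix.size() + 1 &&
        first_line.substr(0, prefix.size()) == prefix) {
        return first_line[prefix.size()];
    }
    return ',';
}
```

A caller would read the first line with std::getline, pass it to detect_separator, and skip that line as data only when it really was a hint.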


To write our code we need a sample CSV file, so we will use this one:

file.csv

Bjarne Stroustrup,1979,C++,bjarne@stroustrup.com
Dennis Ritchie,1970,C,dennis@ritchie.net
Maurice Wilkes,1940,Assembly,maurice@wilkes.us
Brian Kernighan,1977,AWK,brian@kernighan.org
Anonymous,,,anom@null.net

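The code below uses strsep instead of strtok, which we saw in this article. To see the difference, replace the strsep(&buf, delim) calls with strtok (strtok(buf, delim) for the first call, strtok(NULL, delim) for the following ones) and note that the line with empty fields is parsed incorrectly: strtok treats consecutive delimiters as a single one, so the empty fields are skipped. Keep in mind that strsep is a BSD/glibc extension, not part of standard C++.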
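The difference can be seen in isolation with the last line of file.csv. The sketch below is not the program from this post: split_strtok and split_keep_empty are hypothetical helper names, and split_keep_empty is a portable stand-in, written with std::string::find, that keeps empty fields the way strsep does.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Tokenize with std::strtok: a run of delimiters counts as ONE delimiter,
// so empty fields silently disappear.
std::vector<std::string> split_strtok(std::string line, const char* delim) {
    std::vector<std::string> out;
    for (char* tok = std::strtok(line.data(), delim); tok != nullptr;
         tok = std::strtok(nullptr, delim)) {
        out.emplace_back(tok);
    }
    return out;
}

// Portable stand-in for the strsep behavior: every delimiter ends a field,
// so empty fields are kept.
std::vector<std::string> split_keep_empty(const std::string& line, char delim) {
    std::vector<std::string> out;
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = line.find(delim, start);
        if (pos == std::string::npos) {
            out.push_back(line.substr(start));  // last field
            return out;
        }
        out.push_back(line.substr(start, pos - start));
        start = pos + 1;
    }
}
```

For the line Anonymous,,,anom@null.net, split_strtok returns only two tokens, so the e-mail would be printed under the YEAR label, while split_keep_empty returns four fields, two of them empty.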


C++ code

mkdir cppsv
cd cppsv
vim main.cpp

main.cpp

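#include <fmt/format.h>
#include <array>
#include <cstddef>
#include <cstdio>    // stderr
#include <cstring>   // strsep (BSD/glibc extension, not standard C++)
#include <fstream>
#include <string>

auto main(int argc, char **argv) -> int {
   if(argc > 1){
     std::string line {};
     std::ifstream file(argv[1]);
     if(!file.is_open()){
       fmt::print(stderr, "Could not open: {}\n", argv[1]);
       return 1;
     }
     const char * delim = ",";
     const std::array<std::string, 4> fields {"NAME", "YEAR", "LANG", "MAIL"};
     while(std::getline(file, line)){
       std::size_t i {};            // restart the column index on every line
       char * buf = line.data();
       // strsep also returns empty fields, so "a,,b" yields three tokens
       char * pline = strsep(&buf, delim);
       // the bounds check ignores extra columns instead of overrunning fields
       while(pline != nullptr && i < fields.size()){
         fmt::print("{}: {}\n", fields[i], pline);
         pline = strsep(&buf, delim);
         ++i;
       }
       fmt::print("-------\n");     // one separator per record
     }
   }else{
     fmt::print(stderr, "Use: {} file.csv\n", argv[0]);
     return 1;
   }
}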

CMakeLists.txt file

cmake_minimum_required(VERSION 3.26.3)
project(cppsv
   LANGUAGES CXX
   VERSION 0.0.1
)
add_compile_options(-Wall -Werror -Wextra -Wpedantic)
set (CMAKE_CXX_STANDARD 23)
add_executable(a.out main.cpp)
find_package(fmt REQUIRED)
target_link_libraries(a.out PRIVATE fmt::fmt-header-only)

Contents of the directory and compiling the code

tree cppsv

cppsv
├── CMakeLists.txt
├── file.csv
└── main.cpp

0 directories, 3 files

Compiling and running:

cmake -B build .
cd build && make
./a.out ../file.csv

Very easy, and a nice C++ exercise. Of course, there is still room for improvement: for example, you can add separator detection to be compatible with the Excel variant, or even turn the code into a ready-made library. Another point is that some CSV files wrap their data in double quotes, so the parser should strip the quotes and treat a separator inside a quoted field as part of the data.
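One possible way to handle double-quoted data is a small character-by-character loop. This is only a sketch under a few assumptions: the name split_quoted is hypothetical, a doubled quote ("") inside a quoted field stands for one literal quote, and a field never spans more than one line.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: split one CSV line, honoring double-quoted fields.
// Inside quotes the separator is literal and "" stands for one quote.
std::vector<std::string> split_quoted(std::string_view line, char sep = ',') {
    std::vector<std::string> fields;
    std::string current;
    bool in_quotes = false;
    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (in_quotes) {
            if (c == '"' && i + 1 < line.size() && line[i + 1] == '"') {
                current += '"';     // escaped quote
                ++i;                // skip the second quote of the pair
            } else if (c == '"') {
                in_quotes = false;  // closing quote
            } else {
                current += c;       // separators are plain data here
            }
        } else if (c == '"') {
            in_quotes = true;       // opening quote
        } else if (c == sep) {
            fields.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    fields.push_back(current);      // last field (may be empty)
    return fields;
}
```

With this helper, a line such as "Ritchie, Dennis",1970,C yields three fields, and unquoted lines like the ones in file.csv are split exactly as before, empty fields included.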


Video

If you want to see the whole process step by step, watch the video below. The video is in Portuguese, but the content is language-independent, and you can still use YouTube’s automatic translation.


