Hello everyone,
Greetings today!
Today I'm going to show you how to use MultiResourceItemReader to read multiple files in Spring Batch.
Requirements
- The Spring Batch metadata tables must already exist. If they do not, see the script to add Spring Batch metadata tables.
- You need to know how to process a single file in Spring Batch. For a reference, see Spring Batch Example - CSV To Database with Spring Boot & Oracle.
Let's Get Started
We will write code that reads several CSV files in the format below, calculates a percentage and a pass or fail result for each student, and finally saves each record to the DB.
Roll-No,Maths-Marks,English-Marks,Science-Marks,Email-Address
1,44,55,77,test@yopmail.com
Create a POJO with the fields above.
package com.student.report.model;

import lombok.*;
import javax.persistence.*;
import java.math.BigDecimal;

@Setter
@Getter
@ToString
@AllArgsConstructor
@NoArgsConstructor
@Entity
@Table(name = "STUDENT_MARKS")
public class StudentReportCard {

    @Id
    @Column(name = "ROLL_NO")
    private long rollNo;

    @Column(name = "EMAIL_ADDRESS")
    private String emailAddress;

    @Column(name = "MATHS_MARKS")
    private BigDecimal mathsMarks;

    @Column(name = "SCIENCE_MARKS")
    private BigDecimal scienceMarks;

    @Column(name = "ENGLISH_MARKS")
    private BigDecimal englishMarks;

    // Filled in by the processor, not read from the CSV
    @Column(name = "PERCENTAGE")
    private BigDecimal percentage;

    @Column(name = "RESULT")
    private String result;
}
Next, let's add DB configuration to application.properties to use Oracle DB.
spring.datasource.url=jdbc:oracle:thin:@localhost:1521:orcl
spring.datasource.username=username
spring.datasource.password=password
spring.datasource.driver-class-name=oracle.jdbc.driver.OracleDriver
spring.jpa.hibernate.ddl-auto=create
Since we are using Apache Commons CSV to read the CSV files, we need to add the following dependency:
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-csv</artifactId>
    <version>1.8</version>
</dependency>
Now we need to configure a MultiResourceItemReader that picks up all the files from a specific folder and registers them as its resources. It then hands each resource to a delegate class that implements ResourceAwareItemReaderItemStream, where you can add the logic to read each file and move it to a different directory.
public MultiResourceItemReader<StudentReportCard> configureFileReader() {
    MultiResourceItemReader<StudentReportCard> itemReader = new MultiResourceItemReader<>();
    List<FileSystemResource> fileSystemResources = new ArrayList<>();
    // Files.list keeps the directory open, so close the stream when done
    try (Stream<Path> stream = Files.list(Paths.get("F://CodeSpace//Students//"))) {
        stream.forEach(x -> fileSystemResources.add(new FileSystemResource(x.toFile())));
    } catch (IOException e) {
        e.printStackTrace();
    }
    itemReader.setResources(fileSystemResources.toArray(new Resource[0]));
    // Each file is handed to the CSVReader delegate
    itemReader.setDelegate(csvReader());
    // Non-strict: an empty folder is not treated as an error
    itemReader.setStrict(false);
    return itemReader;
}
As you can see, I first fetch all the files from F://CodeSpace//Students//, then set them as the reader's resources, and then delegate the actual reading to CSVReader.
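If the folder may also contain subfolders or files that are not CSV, you might want to filter the listing before turning it into resources. Here is a minimal, JDK-only sketch of that idea. The class name CsvFileLister, the ".csv" extension filter, and the temporary demo folder are my assumptions and not part of the project (the sample files in this post have no extension, so adjust the filter to your own naming).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CsvFileLister {

    // Returns the regular files ending in ".csv" directly under dir, sorted by path.
    static List<Path> listCsvFiles(Path dir) throws IOException {
        // Files.list keeps a directory handle open, so close it with try-with-resources.
        try (Stream<Path> paths = Files.list(dir)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".csv"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    // Builds a throwaway folder (hypothetical contents) and lists the CSV file names in it.
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("students");
            Files.createFile(dir.resolve("Student_Marks_Std8.csv"));
            Files.createFile(dir.resolve("Student_Marks_Std7.csv"));
            Files.createFile(dir.resolve("notes.txt"));    // skipped: wrong extension
            Files.createDirectory(dir.resolve("archive")); // skipped: not a regular file
            return listCsvFiles(dir).stream()
                    .map(p -> p.getFileName().toString())
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Each returned Path could then be wrapped in a FileSystemResource exactly as in configureFileReader.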
Spring Batch then passes the files to CSVReader one at a time through its setResource method.
CSVReader must implement the following methods:
public void open(ExecutionContext executionContext) throws ItemStreamException
public void update(ExecutionContext executionContext) throws ItemStreamException
public StudentReportCard read() throws Exception
public void close() throws ItemStreamException
Inside the open method, fetch the file from the resource and load all of its CSV records. The read method then returns one record at a time; each record is passed to the processor and then to the writer. Finally, in the close method, you can add cleanup code, such as moving files and resetting variables.
Below is the complete code for CSVReader to read multiple files.
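package com.student.report.reader;

import com.student.report.model.StudentReportCard;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemStreamException;
import org.springframework.batch.item.file.ResourceAwareItemReaderItemStream;
import org.springframework.core.io.Resource;

import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.math.BigDecimal;
import java.util.List;

public class CSVReader implements ResourceAwareItemReaderItemStream<StudentReportCard> {

    private Resource resource;
    private File file = null;
    private CSVParser csvParser;
    private Reader reader;
    private List<CSVRecord> csvRecords;
    private int currentRecord = 0;

    @Override
    public void setResource(Resource resource) {
        this.resource = resource;
    }

    @Override
    public void open(ExecutionContext executionContext) throws ItemStreamException {
        try {
            file = resource.getFile();
            reader = new FileReader(file);
            CSVFormat csvFormat = CSVFormat.DEFAULT.withDelimiter(',');
            // withFirstRecordAsHeader() takes the column names from the first row and skips it
            csvParser = new CSVParser(reader,
                    csvFormat.withHeader("Roll-No", "Maths-Marks", "English-Marks",
                                    "Science-Marks", "Email-Address")
                            .withFirstRecordAsHeader());
            csvRecords = csvParser.getRecords();
        } catch (IOException e) {
            // Fail the step instead of continuing with missing or stale records
            throw new ItemStreamException("Could not read " + resource, e);
        }
    }

    @Override
    public StudentReportCard read() throws Exception {
        // Return the next record, or null once the current file is exhausted
        if (currentRecord < csvRecords.size()) {
            CSVRecord csvRecord = csvRecords.get(currentRecord);
            StudentReportCard studentReportCard = new StudentReportCard();
            studentReportCard.setRollNo(Long.valueOf(csvRecord.get("Roll-No")));
            studentReportCard.setMathsMarks(new BigDecimal(csvRecord.get("Maths-Marks")));
            studentReportCard.setScienceMarks(new BigDecimal(csvRecord.get("Science-Marks")));
            studentReportCard.setEnglishMarks(new BigDecimal(csvRecord.get("English-Marks")));
            studentReportCard.setEmailAddress(csvRecord.get("Email-Address"));
            currentRecord++;
            return studentReportCard;
        }
        return null;
    }

    @Override
    public void close() throws ItemStreamException {
        try {
            if (csvParser != null) {
                csvParser.close(); // also closes the underlying reader
            }
        } catch (IOException e) {
            throw new ItemStreamException("Could not close " + file, e);
        }
        // Reset state so the next file starts from its first record
        resource = null;
        file = null;
        reader = null;
        currentRecord = 0;
    }

    @Override
    public void update(ExecutionContext executionContext) throws ItemStreamException {
    }
}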
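To make the per-file lifecycle easier to picture, here is a small JDK-only sketch of how a multi-file reader can drive a delegate: set the resource, open, read until null, close, then move on to the next file. This is a simplified model of that hand-off and not Spring Batch's actual implementation. The names PerFileReader, DataLineReader, readAll, and demo are hypothetical, and the sketch returns raw lines instead of StudentReportCard objects to stay short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MultiFileLifecycleSketch {

    // Hypothetical stand-in for the delegate contract: one resource at a time.
    interface PerFileReader<T> {
        void setResource(Path resource);
        void open();
        T read();   // returns null when the current file is exhausted
        void close();
    }

    // Reads the data lines of one file, skipping the header row.
    static class DataLineReader implements PerFileReader<String> {
        private Path resource;
        private List<String> lines;
        private int current;

        @Override
        public void setResource(Path resource) {
            this.resource = resource;
        }

        @Override
        public void open() {
            try {
                lines = Files.readAllLines(resource);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            current = 1; // index 0 is the header
        }

        @Override
        public String read() {
            return current < lines.size() ? lines.get(current++) : null;
        }

        @Override
        public void close() {
            lines = null;
            current = 0;
        }
    }

    // Drives the delegate over every resource in order and collects all items.
    static <T> List<T> readAll(List<Path> resources, PerFileReader<T> delegate) {
        List<T> items = new ArrayList<>();
        for (Path resource : resources) {
            delegate.setResource(resource);
            delegate.open();
            try {
                T item;
                while ((item = delegate.read()) != null) {
                    items.add(item);
                }
            } finally {
                delegate.close(); // always release the file before moving on
            }
        }
        return items;
    }

    // Writes two small sample files to a temporary folder and reads them back.
    static List<String> demo() {
        String header = "Roll-No,Maths-Marks,English-Marks,Science-Marks,Email-Address";
        try {
            Path dir = Files.createTempDirectory("lifecycle");
            Path std7 = Files.write(dir.resolve("Student_Marks_Std7"), Arrays.asList(
                    header, "1,44,99,77,test1@yopmail.com", "2,46,75,78,test2@yopmail.com"));
            Path std8 = Files.write(dir.resolve("Student_Marks_Std8"), Arrays.asList(
                    header, "3,44,55,22,test3@yopmail.com"));
            return readAll(Arrays.asList(std7, std8), new DataLineReader());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Because the same delegate instance is reused for every file, any per-file state (here the current index) has to be reset in open or close, which is why CSVReader resets currentRecord when it closes.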
Now let's look at the complete reader, processor, and writer configuration in BatchConfig.java.
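package com.student.report.config;

import com.student.report.model.StudentReportCard;
import com.student.report.processor.StudentMarksProcessor;
import com.student.report.reader.CSVReader;
import com.student.report.writer.StudentMarksWriter;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.item.file.MultiResourceItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;
import org.springframework.core.io.Resource;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

@Configuration
public class BatchConfig {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Bean(name = "generateReportCard")
    public Job generateReportCard() {
        return jobBuilderFactory.get("generateReportCard")
                .incrementer(new RunIdIncrementer())
                .start(processMarksCSVFile())
                .build();
    }

    @Bean
    public Step processMarksCSVFile() {
        return stepBuilderFactory.get("processMarksCSVFile")
                .<StudentReportCard, StudentReportCard>chunk(1)
                .reader(configureFileReader())
                .processor(studentMarksProcessor())
                .writer(writeStudentMarks())
                .build();
    }

    @Bean
    public MultiResourceItemReader<StudentReportCard> configureFileReader() {
        MultiResourceItemReader<StudentReportCard> itemReader = new MultiResourceItemReader<>();
        List<FileSystemResource> fileSystemResources = new ArrayList<>();
        // Files.list keeps the directory open, so close the stream when done
        try (Stream<Path> stream = Files.list(Paths.get("F://CodeSpace//Students//"))) {
            stream.forEach(x -> fileSystemResources.add(new FileSystemResource(x.toFile())));
        } catch (IOException e) {
            e.printStackTrace();
        }
        itemReader.setResources(fileSystemResources.toArray(new Resource[0]));
        itemReader.setDelegate(csvReader());
        itemReader.setStrict(false);
        return itemReader;
    }

    @Bean
    @StepScope
    public CSVReader csvReader() {
        return new CSVReader();
    }

    @Bean
    public StudentMarksProcessor studentMarksProcessor() {
        return new StudentMarksProcessor();
    }

    @Bean
    public StudentMarksWriter writeStudentMarks() {
        return new StudentMarksWriter();
    }
}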
Now let's create a processor that calculates the percentage and the pass or fail result for each student in each file.
package com.student.report.processor;

import com.student.report.model.StudentReportCard;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.item.ItemProcessor;

import java.math.BigDecimal;
import java.math.RoundingMode;

public class StudentMarksProcessor implements ItemProcessor<StudentReportCard, StudentReportCard> {

    private static final Logger LOGGER = LoggerFactory.getLogger(StudentMarksProcessor.class);

    @Override
    public StudentReportCard process(StudentReportCard studentReportCard) throws Exception {
        BigDecimal percentage = calculatePercentage(studentReportCard);
        studentReportCard.setPercentage(percentage);
        // 35 percent or more is a pass
        if (percentage.compareTo(new BigDecimal(35)) >= 0) {
            studentReportCard.setResult("Pass");
        } else {
            studentReportCard.setResult("Fail");
        }
        return studentReportCard;
    }

    // (english + maths + science) * 100 / 300, rounded half-up to two decimals
    private BigDecimal calculatePercentage(StudentReportCard studentReportCard) {
        return studentReportCard.getEnglishMarks()
                .add(studentReportCard.getMathsMarks())
                .add(studentReportCard.getScienceMarks())
                .multiply(new BigDecimal(100))
                .divide(new BigDecimal(300), 2, RoundingMode.HALF_UP);
    }
}
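To check the arithmetic by hand, here is a small standalone sketch of the same calculation using only the JDK. The class name PercentageSketch and its method names are made up for illustration, and I am assuming each subject is marked out of 100, which is what dividing by 300 implies.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class PercentageSketch {

    // Assumption: three subjects, each marked out of 100.
    static final BigDecimal MAX_TOTAL = new BigDecimal(300);
    static final BigDecimal PASS_MARK = new BigDecimal(35);

    // total * 100 / 300, rounded half-up to two decimal places
    static BigDecimal percentage(int maths, int english, int science) {
        BigDecimal total = BigDecimal.valueOf(maths + english + science);
        return total.multiply(BigDecimal.valueOf(100))
                .divide(MAX_TOTAL, 2, RoundingMode.HALF_UP);
    }

    // "Pass" when the percentage is 35 or more, otherwise "Fail"
    static String result(BigDecimal percentage) {
        return percentage.compareTo(PASS_MARK) >= 0 ? "Pass" : "Fail";
    }

    public static void main(String[] args) {
        // Roll number 1 from the sample data: 44 + 99 + 77 = 220
        BigDecimal p = percentage(44, 99, 77);
        System.out.println(p + " " + result(p)); // prints "73.33 Pass"
    }
}
```

Passing an explicit scale and rounding mode to divide matters here: 22000 / 300 has no exact decimal representation, so a plain divide would throw an ArithmeticException.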
Let's use Spring Data JPA to create a repository layer that the writer will use to store student report cards.
@Repository
public interface StudentReportCardRepository extends JpaRepository<StudentReportCard, Long> {
}
Next, let's create a writer that will be used to store student report cards.
package com.student.report.writer;

public class StudentMarksWriter implements ItemWriter<StudentReportCard> {

    // SLF4J logger (org.slf4j.Logger / org.slf4j.LoggerFactory)
    private static final Logger LOGGER = LoggerFactory.getLogger(StudentMarksWriter.class);

    @Autowired
    private StudentReportCardRepository studentReportCardRepository;

    @Override
    public void write(List<? extends StudentReportCard> list) throws Exception {
        list.forEach(x -> {
            LOGGER.info("Storing " + x);
            studentReportCardRepository.save(x);
        });
    }
}
To run the batch job, schedule it to run at fixed intervals as shown below.
package com.student.report;

@SpringBootApplication
@EnableBatchProcessing
@EnableScheduling // required, otherwise @Scheduled methods are never triggered
public class StudentReportMgtApplication {

    @Autowired
    JobLauncher jobLauncher;

    @Autowired
    Job generateReportCard;

    public static void main(String[] args) {
        SpringApplication.run(StudentReportMgtApplication.class, args);
    }

    // Runs once every minute
    @Scheduled(cron = "0 */1 * * * ?")
    public void perform() throws Exception {
        // A unique JobID makes every run a new job instance
        JobParameters params = new JobParametersBuilder()
                .addString("JobID", String.valueOf(System.currentTimeMillis()))
                .toJobParameters();
        jobLauncher.run(generateReportCard, params);
    }
}
Time to test the code!!
Put some files in the configured location, F://CodeSpace//Students// in my case, and run the application.
I am using the three files below.
Student_Marks_Std7
Roll-No,Maths-Marks,English-Marks,Science-Marks,Email-Address
1,44,99,77,test1@yopmail.com
2,46,75,78,test2@yopmail.com
Student_Marks_Std8
Roll-No,Maths-Marks,English-Marks,Science-Marks,Email-Address
3,44,55,22,test3@yopmail.com
4,46,75,55,test4@yopmail.com
Student_Marks_Std9
Roll-No,Maths-Marks,English-Marks,Science-Marks,Email-Address
6,44,77,77,test6@yopmail.com
7,77,75,78,test7@yopmail.com
The output is printed as follows.
Job: [SimpleJob: [name=generateReportCard]] launched with the following parameters: [{run.id=30}]
Executing step: [processMarksCSVFile]
Storing StudentReportCard(rollNo=1, emailAddress=test1@yopmail.com, mathsMarks=44, scienceMarks=77, englishMarks=99, percentage=73.33, result=Pass)
Storing StudentReportCard(rollNo=2, emailAddress=test2@yopmail.com, mathsMarks=46, scienceMarks=78, englishMarks=75, percentage=66.33, result=Pass)
Storing StudentReportCard(rollNo=3, emailAddress=test3@yopmail.com, mathsMarks=44, scienceMarks=22, englishMarks=55, percentage=40.33, result=Pass)
Storing StudentReportCard(rollNo=4, emailAddress=test4@yopmail.com, mathsMarks=46, scienceMarks=55, englishMarks=75, percentage=58.67, result=Pass)
Storing StudentReportCard(rollNo=6, emailAddress=test6@yopmail.com, mathsMarks=44, scienceMarks=77, englishMarks=77, percentage=66.00, result=Pass)
Storing StudentReportCard(rollNo=7, emailAddress=test7@yopmail.com, mathsMarks=77, scienceMarks=78, englishMarks=75, percentage=76.67, result=Pass)
Step: [processMarksCSVFile] executed in 152ms
Job: [SimpleJob: [name=generateReportCard]] completed with the following parameters: [{run.id=30}] and the following status: [COMPLETED] in 181ms
Below is the project structure for reference
Thanks!
Enjoy your learning!
If you have any doubts let me know.