Overview

This is a basic example of using VFS to retrieve files from a remote system over the SFTP protocol. Only files whose paths match a specified regular expression are retrieved.

Example Configuration

For the purposes of this example the remote system is named "sftpremote.example.com". The files to be retrieved are in a directory named /data/source/fires/smoke and the files are named smokeYearMoDy_wkt.txt. Thus the data file for March 25, 2008 is named "smoke20080325_wkt.txt".
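As a small illustration of the naming convention, the sketch below builds the expected file name for a given date. The class SmokeFileNames and its forDate method are hypothetical helpers that are not part of the example project, and java.time requires Java 8 or later, which is newer than the Java 1.5 target used in the pom.xml.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the original example project.
class SmokeFileNames {

    // "yyyyMMdd" produces the 8-digit YearMoDy stamp used in the file names.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd");

    // Builds the data file name for the given date.
    static String forDate(LocalDate date) {
        return "smoke" + date.format(STAMP) + "_wkt.txt";
    }

    public static void main(String[] args) {
        // Prints smoke20080325_wkt.txt
        System.out.println(forDate(LocalDate.of(2008, 3, 25)));
    }
}
```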

The downloaded files will be received in the local directory /local/received/fires/smoke.

Connect to the remote sftpremote.example.com system using the login "smokey" and password "bear".
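The host name and remote directory combine into the sftp:// URL that VFS resolves. The code listing further down simply concatenates the strings; the sketch below is one possible alternative that uses java.net.URI so that characters that are not legal in a URL path get quoted. The class SftpUrlSketch and its startPath method are hypothetical names used only for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of the original example project.
class SftpUrlSketch {

    // Builds "sftp://<host><remoteDir>" from the configuration values.
    static String startPath(String host, String remoteDir) {
        try {
            // URI(scheme, host, path, fragment) quotes illegal path characters.
            return new URI("sftp", host, remoteDir, null).toString();
        } catch (URISyntaxException ex) {
            throw new IllegalArgumentException("bad host or path", ex);
        }
    }

    public static void main(String[] args) {
        // Prints sftp://sftpremote.example.com/data/source/fires/smoke
        System.out.println(
                startPath("sftpremote.example.com", "/data/source/fires/smoke"));
    }
}
```

The login and password are deliberately left out of the URL here, since the listing supplies them separately through a StaticUserAuthenticator.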

Key Concepts

Access to a remote system using SFTP goes through the SSH secure shell protocol. Although the behavior is similar to FTP, SFTP is not FTP run over a secure connection, and there are some differences between the two that should be noted. One of these is the lack of the FTP binary/ASCII transfer mode: in SFTP all transfers are binary, as if they were executed with an "scp" (secure copy) command.

This example code uses a regular expression to match files on the remote system, so that not all of the files in the source directory are transferred. The filePatternString is set to ".*/smoke\\d{8}_wkt\\.txt". This has the regular expression components of:

  • ".*/" matches any path that precedes the filename. The "." is a wildcard character, and "*" specifies that 0 or more of these may be present. The "/" matches the directory separator.
  • "\\d{8}" matches 8 digits (4 for the year, 2 for the month number, and 2 for the day of the month). The doubled backslash is to escape a single backslash for Java string literals. "\d" indicates any digit, and "{8}" specifies exactly 8 of the preceding character or pattern.
  • "\\." is an escaped "\." which means a literal period instead of a wildcard interpretation of "."
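To see how these components behave, here is a minimal standalone sketch that compiles the same pattern string used in the code listing and tries it against a few paths. The class SmokePatternDemo and its isWanted method are hypothetical names chosen for this illustration; only java.util.regex from the standard library is used.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original example project.
class SmokePatternDemo {

    // Same pattern string as filePatternString in the code listing.
    static final Pattern FILE_PATTERN =
            Pattern.compile(".*/smoke\\d{8}_wkt\\.txt");

    // matches() must consume the whole path, not just a substring of it.
    static boolean isWanted(String remotePath) {
        return FILE_PATTERN.matcher(remotePath).matches();
    }

    public static void main(String[] args) {
        String dir = "/data/source/fires/smoke/";
        // true: path, 8 digits, "_wkt" and a literal period are all present
        System.out.println(isWanted(dir + "smoke20080325_wkt.txt"));
        // false: the "_wkt" part is missing
        System.out.println(isWanted(dir + "smoke20070430.txt"));
        // false: "\\." only accepts a literal period, not any character
        System.out.println(isWanted(dir + "smoke20080325_wktXtxt"));
        // false: ".*/" requires a "/" somewhere before the file name
        System.out.println(isWanted("smoke20080325_wkt.txt"));
    }
}
```

Note that the last case fails only because the pattern is applied to the full remote path; a bare file name with no directory part would need a pattern without the leading ".*/".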

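It is important to close the file system when processing is finished, either by calling the close() method of the DefaultFileSystemManager or, as the release() method in the listing below does, by calling closeFileSystem() on the FileSystemManager. This cleans up temporary files and closes the provider's connections. Otherwise the program will appear to hang after downloading files.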

Source Code

The code provided below is for a Maven 2 project.

pom.xml Project File

The pom.xml file defines how the project is built for Maven 2:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>gov.noaa.eds.byExample</groupId>
    <artifactId>trySimpleVfsSftp</artifactId>
    <packaging>jar</packaging>
    <version>1.0-SNAPSHOT</version>
    <name>trySimpleVfsSftp</name>
    <url>http://maven.apache.org</url>
    <build>
        <extensions>
            <extension>
                <groupId>org.apache.maven.wagon</groupId>
                <artifactId>wagon-ssh-external</artifactId>
                <version>1.0-beta-2</version>
            </extension>
        </extensions>
        <plugins>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.5</source>
                    <target>1.5</target>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <archive>
                        <manifest>
                            <mainClass>gov.noaa.eds.byExample.trySimpleVfsSftp.App</mainClass>
                        </manifest>
                    </archive>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-antrun-plugin</artifactId>
                <configuration>
                    <tasks>
                        <java classname="gov.noaa.eds.byExample.trySimpleVfsSftp.App" classpathref="maven.runtime.classpath">
                        </java>
                    </tasks>
                </configuration>
            </plugin>
        </plugins>
    </build>
    <dependencies>
        <!-- Supports VFS SFTP -->
        <dependency>
            <groupId>com.jcraft</groupId>
            <artifactId>jsch</artifactId>
            <version>0.1.23</version>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>commons-logging</groupId>
            <artifactId>commons-logging</artifactId>
            <version>1.1</version>
        </dependency>
        <dependency>
            <groupId>commons-vfs</groupId>
            <artifactId>commons-vfs</artifactId>
            <version>1.0</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>

Code Listing

Content of src/main/java/gov/noaa/eds/byExample/trySimpleVfsSftp/App.java (yes, Maven does expect the file to be nested down in this directory hierarchy). Be sure to customize the variables near the top of the listing to match your environment if you intend to run this code. The variables to modify are host, user, password, remoteDir, filePatternString and localDir.

/*
 * App.java
 */
package gov.noaa.eds.byExample.trySimpleVfsSftp;

import java.io.File;
import java.util.regex.Pattern;
import org.apache.commons.vfs.AllFileSelector;
import org.apache.commons.vfs.FileObject;
import org.apache.commons.vfs.FileSystem; // needed by release()
import org.apache.commons.vfs.FileSystemException;
import org.apache.commons.vfs.FileSystemManager;
import org.apache.commons.vfs.FileSystemOptions;
import org.apache.commons.vfs.FileType;
import org.apache.commons.vfs.UserAuthenticator;
import org.apache.commons.vfs.VFS;
import org.apache.commons.vfs.auth.StaticUserAuthenticator;
import org.apache.commons.vfs.impl.DefaultFileSystemConfigBuilder;
import org.apache.commons.vfs.impl.DefaultFileSystemManager;
import org.apache.commons.vfs.impl.StandardFileSystemManager;
import org.apache.commons.vfs.provider.local.LocalFile;


/**
 * Example use of VFS sftp
 *
 */
public class App {

    // Set these variables for your testing environment:
    private String host = "sftpremote.example.com";  // Remote SFTP hostname
    private String user = "smokey";      // Remote system login name
    private String password = "bear";    // Remote system password
    private String remoteDir = "/data/source/fires/smoke";
    // Look for a file path like "smoke20070128_wkt.txt"
    private String filePatternString = ".*/smoke\\d{8}_wkt\\.txt";
    // Local directory to receive file
    private String localDir = "/local/received/fires/smoke";
    
    
    private File localDirFile;
    private Pattern filePattern;
    private FileSystemManager fsManager = null;
    private FileSystemOptions opts = null;
    private FileObject sftpFile;

    private FileObject src = null; // used for cleanup in release()

    public static void main(String[] args) {
        System.out.println("SFTP download");
        App app = new App();

        app.initialize();

        app.process();

        app.release();

    } // main( String[] args )


    /**
     * Creates the download directory localDir if it
     * does not exist and makes a connection to the remote SFTP server.
     * 
     */
    public void initialize() {
        if (localDirFile == null) {
            localDirFile = new File(localDir);
        }
        if (!this.localDirFile.exists()) {
            localDirFile.mkdirs();
        }

        try {
            this.fsManager = VFS.getManager();
        } catch (FileSystemException ex) {
            throw new RuntimeException("failed to get fsManager from VFS", ex);
        }

        UserAuthenticator auth = new StaticUserAuthenticator(null, this.user,
                this.password);
        this.opts = new FileSystemOptions();
        try {
            DefaultFileSystemConfigBuilder.getInstance().setUserAuthenticator(opts,
                    auth);
        } catch (FileSystemException ex) {
            throw new RuntimeException("setUserAuthenticator failed", ex);
        }

        this.filePattern = Pattern.compile(filePatternString);
    } // initialize()


    /**
     * Retrieves files whose paths match filePattern from the SFTP server
     * and stores them in the local directory.
     */
    public void process() {

        String startPath = "sftp://" + this.host + this.remoteDir;
        FileObject[] children;

        // Set starting path on remote SFTP server.
        try {
            this.sftpFile = this.fsManager.resolveFile(startPath, opts);

            System.out.println("SFTP connection successfully established to " +
                    startPath);
        } catch (FileSystemException ex) {
            throw new RuntimeException("SFTP error parsing path " +
                    this.remoteDir,
                    ex);
        }


        // Get a directory listing
        try {
            children = this.sftpFile.getChildren();
        } catch (FileSystemException ex) {
            throw new RuntimeException("Error collecting directory listing of " +
                    startPath, ex);
        }

        search:
        for (FileObject f : children) {
            try {
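                // Use "/" rather than File.separatorChar: the name is appended
                // to a file:// URL below, and URLs always use forward slashes.
                String relativePath = "/" + f.getName().getBaseName();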

                if (f.getType() == FileType.FILE) {
                    System.out.println("Examining remote file " + f.getName());

                    if (!this.filePattern.matcher(f.getName().getPath()).matches()) {
                        System.out.println("  Filename does not match, skipping file ." +
                                relativePath);
                        continue search;
                    }

                    String localUrl = "file://" + this.localDir + relativePath;
                    String standardPath = this.localDir + relativePath;
                    System.out.println("  Standard local path is " + standardPath);
                    LocalFile localFile =
                            (LocalFile) this.fsManager.resolveFile(localUrl);
                    System.out.println("    Resolved local file name: " +
                            localFile.getName());

                    if (!localFile.getParent().exists()) {
                        localFile.getParent().createFolder();
                    }

                    System.out.println("  ### Retrieving file ###");
                    localFile.copyFrom(f,
                            new AllFileSelector());
                } else {
                    System.out.println("Ignoring non-file " + f.getName());
                }
            } catch (FileSystemException ex) {
                // Covers type lookup, local path resolution and the copy itself.
                throw new RuntimeException("Error retrieving remote file " +
                        f.getName(), ex);
            }
        } // for (FileObject f : children)

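        // Set src for cleanup in release(). Fall back to the directory itself
        // when the listing is empty, so that children[0] cannot throw.
        src = (children.length > 0) ? children[0] : this.sftpFile;
    } // process()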


    /**
     * Release system resources, close connection to the filesystem. 
     */
    public void release() {
        FileSystem fs = null;

        fs = this.src.getFileSystem(); // This works even if the src is closed.
        this.fsManager.closeFileSystem(fs);
    } // release()
} // class App

Detlev Moerk brought to my attention that there is an alternative to casting fsManager to DefaultFileSystemManager and then calling close(). The release method can instead call this.fsManager.closeFileSystem(fs). To do this, declare a FileObject field in the class, set it from one of the FileObjects in the process routine, and obtain the FileSystem from it in release():

FileObject src = null;

// In process(), do your things, including src = children[0]

public void release() {
    FileSystem fs = null;

    this.src.close(); // Seems to still work even if this line is omitted
    fs = this.src.getFileSystem(); // This works even after the src is closed.
    this.fsManager.closeFileSystem(fs);
}

I have incorporated Detlev's suggestion into the example above. The original release method used "((DefaultFileSystemManager) this.fsManager).close();", which worked once, but if you reinitialize and run the process again you will get an error containing

org.apache.commons.vfs.FileSystemException: Unknown scheme "sftp" in URI "sftp://sftpremote.example.com/data/source/fires/smoke/"

Detlev's method doesn't have this problem.

Compiling

Compile the source code with

mvn assembly:assembly

This will create an executable jar file in the standard target directory.

Running

Use a command like this to run the example

java -jar target/trySimpleVfsSftp-1.0-SNAPSHOT-jar-with-dependencies.jar

Sample Output

If the remote system has the following files in the /data/source/fires/smoke directory:

README.txt

smoke20070328_wkt.txt

smoke20070426_wkt.txt

smoke20070430.txt

The middle two with "_wkt" in their names should be retrieved.

SFTP download
Mar 25, 2008 1:00:44 PM org.apache.commons.vfs.VfsLog info
INFO: Using "/tmp/vfs_cache" as temporary files store.
SFTP connection successfully established to sftp://sftpremote.example.com/data/source/fires/smoke
Examining remote file sftp://sftpremote.example.com/data/source/fires/smoke/README.txt
  Filename does not match, skipping file ./README.txt
Examining remote file sftp://sftpremote.example.com/data/source/fires/smoke/smoke20070328_wkt.txt
  Standard local path is /local/received/fires/smoke/smoke20070328_wkt.txt
    Resolved local file name: file:///local/received/fires/smoke/smoke20070328_wkt.txt
  ### Retrieving file ###
Examining remote file sftp://sftpremote.example.com/data/source/fires/smoke/smoke20070426_wkt.txt
  Standard local path is /local/received/fires/smoke/smoke20070426_wkt.txt
    Resolved local file name: file:///local/received/fires/smoke/smoke20070426_wkt.txt
  ### Retrieving file ###
Examining remote file sftp://sftpremote.example.com/data/source/fires/smoke/smoke20070430.txt
  Filename does not match, skipping file ./smoke20070430.txt

There should now be files matching the filePatternString in the local machine directory "/local/received/fires/smoke".
