WebDAV write problems, loses files with no error, no consistency

Bruce Edge
I'm not seeing all the files I'm writing out to WebDAV from a bundle
thread and I'm losing files.
It does a simple zip unpack. If I redirect the output to a non-WebDAV
folder it's fine.

I am more invested in using Oliver's work here
https://issues.apache.org/jira/browse/SLING-4223 as a long term solution,
but I'm still concerned that this doesn't work as I will be needing some
WebDAV functionality.

The results are also inconsistent. The output file set is never complete.
It always fails in one of 3 ways:
1) loses files with no error
2) fails to update a timestamp
3) fails to create a directory

Even if I use the same input over and over, the output is different every
time. I'm using a simple unzip loop.

        // Requires java.io.*, java.util.zip.ZipEntry, java.util.zip.ZipInputStream
        // and org.apache.commons.io.IOUtils.
        public void extractAll(File inputFile, File outputDir) throws IOException {
            if (!outputDir.canWrite()) {
                throw new IOException("No write permissions for: " + outputDir.getAbsolutePath());
            }
            // Closing the ZipInputStream also closes the underlying FileInputStream.
            try (ZipInputStream zipInputStream = new ZipInputStream(new FileInputStream(inputFile))) {
                ZipEntry zipEntry = zipInputStream.getNextEntry();
                while (zipEntry != null) {
                    if (!zipEntry.isDirectory()) {
                        File newFile = new File(outputDir, zipEntry.getName());
                        // Create any missing parent folders before writing the file.
                        File parentFile = newFile.getParentFile();
                        if (!parentFile.isDirectory() && !parentFile.mkdirs()) {
                            throw new IOException("Failed to create parentFile folder: "
                                    + parentFile.getAbsolutePath());
                        }
                        try (FileOutputStream outputStream = new FileOutputStream(newFile)) {
                            IOUtils.copy(zipInputStream, outputStream);
                            if (!newFile.exists()) {
                                throw new IOException("Failed to extract file: "
                                        + newFile.getAbsolutePath());
                            }
                        }
                        // Preserve the entry's timestamp when the archive provides one.
                        if (zipEntry.getTime() > 0 && !newFile.setLastModified(zipEntry.getTime())) {
                            throw new IOException("Failed to update timestamp for file: "
                                    + newFile.getAbsolutePath());
                        }
                    }
                    zipInputStream.closeEntry();
                    zipEntry = zipInputStream.getNextEntry();
                }
            }
        }

If I step through it, right after the:

        IOUtils.copy(zipInputStream, outputStream)
        ...
        outputStream.close();

the file is missing on the WebDAV filesystem.
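
Since failure case 1) raises no exception, one way to detect it is to compare the archive's entry list with what actually landed in the output folder once extraction has finished. The following is only a sketch under that assumption; `ExtractionCheck` and `missingEntries` are hypothetical names and not part of the code above.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: reports archive entries that did not arrive intact.
class ExtractionCheck {

    /** Returns the names of file entries that are missing or have the wrong size under outputDir. */
    static List<String> missingEntries(File zip, File outputDir) throws IOException {
        List<String> problems = new ArrayList<>();
        try (ZipFile zipFile = new ZipFile(zip)) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue; // only regular files are compared
                }
                File extracted = new File(outputDir, entry.getName());
                long expected = entry.getSize(); // -1 when the size is unknown
                boolean sizeMismatch = expected >= 0 && extracted.length() != expected;
                if (!extracted.isFile() || sizeMismatch) {
                    problems.add(entry.getName());
                }
            }
        }
        return problems;
    }
}
```

Calling `ExtractionCheck.missingEntries(inputFile, outputDir)` right after `extractAll` and failing when the list is non-empty would turn the silent loss into an explicit error.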



I filed a JIRA issue a few days ago showing the lock errors I see with
DEBUG logging enabled.
https://issues.apache.org/jira/browse/SLING-4222

With default logging there are no errors other than the
        SlingRequestProcessorImpl service: Resource /content/ACT/2014/2014_11_15/IPHONE/b_05_courrier.folio not found
messages when the first write is attempted.

Here's a log sample of some of the extraction. The final exception is from
a later stage trying to read a file that was supposedly unpacked
successfully. This is case 1) above, no errors thrown even though not all
files were written.

09.12.2014 22:50:59.303 *INFO* [qtp1099493755-350] logs/request.log
09/Dec/2014:22:50:59 -0800 [1273] -> HEAD
/content/ACT/2014/2014_11_15/IPHONE/b_15_fowler.folio HTTP/1.1
09.12.2014 22:50:59.307 *INFO* [127.0.0.1 [1418194259304] HEAD
/content/ACT/2014/2014_11_15/IPHONE/b_15_fowler.folio HTTP/1.1]
org.apache.sling.engine.impl.SlingRequestProcessorImpl service: Resource
/content/ACT/2014/2014_11_15/IPHONE/b_15_fowler.folio not found
09.12.2014 22:50:59.308 *INFO* [qtp1099493755-350] logs/request.log
09/Dec/2014:22:50:59 -0800 [1273] <- 404 text/html 5ms
09.12.2014 22:50:59.308 *INFO* [qtp1099493755-350] logs/access.log
127.0.0.1 - admin 09/Dec/2014:22:50:59 -0800 "HEAD
/content/ACT/2014/2014_11_15/IPHONE/b_15_fowler.folio HTTP/1.1" 404 3079
"-" "davfs2/1.4.6 neon/0.29.6"
09.12.2014 22:50:59.618 *INFO* [qtp1099493755-350] logs/request.log
09/Dec/2014:22:50:59 -0800 [1274] -> LOCK
/content/ACT/2014/2014_11_15/IPHONE/b_03_edito.folio HTTP/1.1
09.12.2014 22:50:59.625 *INFO* [qtp1099493755-350] logs/request.log
09/Dec/2014:22:50:59 -0800 [1274] <- 201 text/xml; charset=UTF-8 6ms
09.12.2014 22:50:59.625 *INFO* [qtp1099493755-350] logs/access.log
127.0.0.1 - admin 09/Dec/2014:22:50:59 -0800 "LOCK
/content/ACT/2014/2014_11_15/IPHONE/b_03_edito.folio HTTP/1.1" 201 400 "-"
"davfs2/1.4.6 neon/0.29.6"
09.12.2014 22:50:59.625 *INFO* [qtp1099493755-350] logs/request.log
09/Dec/2014:22:50:59 -0800 [1275] -> HEAD
/content/ACT/2014/2014_11_15/IPHONE/b_03_edito.folio HTTP/1.1
09.12.2014 22:50:59.629 *INFO* [127.0.0.1 [1418194259627] HEAD
/content/ACT/2014/2014_11_15/IPHONE/b_03_edito.folio HTTP/1.1]
org.apache.sling.engine.impl.SlingRequestProcessorImpl service: Resource
/content/ACT/2014/2014_11_15/IPHONE/b_03_edito.folio not found
09.12.2014 22:50:59.631 *INFO* [qtp1099493755-350] logs/request.log
09/Dec/2014:22:50:59 -0800 [1275] <- 404 text/html 6ms
09.12.2014 22:50:59.631 *INFO* [qtp1099493755-350] logs/access.log
127.0.0.1 - admin 09/Dec/2014:22:50:59 -0800 "HEAD
/content/ACT/2014/2014_11_15/IPHONE/b_03_edito.folio HTTP/1.1" 404 3067
"-" "davfs2/1.4.6 neon/0.29.6"
09.12.2014 22:50:59.951 *INFO* [qtp1099493755-339] logs/request.log
09/Dec/2014:22:50:59 -0800 [1276] -> LOCK
/content/ACT/2014/2014_11_15/IPHONE/b_05_courrier.folio HTTP/1.1
09.12.2014 22:50:59.958 *INFO* [qtp1099493755-339] logs/request.log
09/Dec/2014:22:50:59 -0800 [1276] <- 201 text/xml; charset=UTF-8 7ms
09.12.2014 22:50:59.958 *INFO* [qtp1099493755-339] logs/access.log
127.0.0.1 - admin 09/Dec/2014:22:50:59 -0800 "LOCK
/content/ACT/2014/2014_11_15/IPHONE/b_05_courrier.folio HTTP/1.1" 201 400
"-" "davfs2/1.4.6 neon/0.29.6"
09.12.2014 22:50:59.959 *INFO* [qtp1099493755-339] logs/request.log
09/Dec/2014:22:50:59 -0800 [1277] -> HEAD
/content/ACT/2014/2014_11_15/IPHONE/b_05_courrier.folio HTTP/1.1
09.12.2014 22:50:59.962 *INFO* [127.0.0.1 [1418194259961] HEAD
/content/ACT/2014/2014_11_15/IPHONE/b_05_courrier.folio HTTP/1.1]
org.apache.sling.engine.impl.SlingRequestProcessorImpl service: Resource
/content/ACT/2014/2014_11_15/IPHONE/b_05_courrier.folio not found
09.12.2014 22:50:59.964 *INFO* [qtp1099493755-339] logs/request.log
09/Dec/2014:22:50:59 -0800 [1277] <- 404 text/html 5ms
09.12.2014 22:50:59.965 *INFO* [qtp1099493755-339] logs/access.log
127.0.0.1 - admin 09/Dec/2014:22:50:59 -0800 "HEAD
/content/ACT/2014/2014_11_15/IPHONE/b_05_courrier.folio HTTP/1.1" 404 3103
"-" "davfs2/1.4.6 neon/0.29.6"
09.12.2014 22:51:00.266 *ERROR* [pool-7-thread-13-<main
queue>(incoming/file)] com.nim.ct.dam.ingest.jobs.ImportFileJobConsumer
Exception: java.io.FileNotFoundException:
/mnt/jcr/content/ACT/2014/2014_11_15/IPHONE/Folio.xml (No such file or
directory)
java.io.FileNotFoundException:
/mnt/jcr/content/ACT/2014/2014_11_15/IPHONE/Folio.xml (No such file or
directory)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:146)
        at com.nim.content.formats.folio.FolioArchiveReader.readFolio(FolioArchiveReader.java:203)
        at com.nim.content.formats.folio.FolioArchiveReader.unpack(FolioArchiveReader.java:76)
        at com.nim.ct.dam.ingest.formats.InputFileInterchangeFolio.ingest(InputFileInterchangeFolio.java:74)
        at com.nim.ct.dam.ingest.jobs.ImportFileJobConsumer.process(ImportFileJobConsumer.java:140)
        at org.apache.sling.event.impl.jobs.JobConsumerManager$JobConsumerWrapper.process(JobConsumerManager.java:512)
        at org.apache.sling.event.impl.jobs.queues.AbstractJobQueue$2.run(AbstractJobQueue.java:591)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)


I'm using the trunk head as of today.

-Bruce

Re: WebDAV write problems, loses files with no error, no consistency

Bertrand Delacretaz
Hi,

On Wed, Dec 10, 2014 at 8:03 AM, Bruce Edge
<[hidden email]> wrote:
> I'm not seeing all the files I'm writing out to webdav from a bundle
> thread and I'm losing files....

To reformulate, IIUC you are running java code that only works with
File objects, and those are actually stored in Sling's JCR repository
because your code works on a WebDAV mounted folder?

If yes, I would stop (in a debugger) when detecting a failure, and
examine the Sling repository at that point to see exactly which nodes/files
were created and what their state is. I suspect there might be name
collisions which mean the JCR content is not what you expect. Or
worse, concurrency issues, but our WebDAV stuff is fairly stable and
well-tested code so it would be surprising to discover this now.

Doing this processing inside Sling, accessing the data via JCR, is
probably more efficient - but you're right that this should also work
under WebDAV.

It's hard to debug your code by reading it here; if you can reduce it to
the smallest thing that fails we might be able to help better.
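
A reduced test case along these lines could be as small as a loop that writes a few files into the mounted folder and checks each one after `close()`. This is only a sketch; `WriteProbe` and `countFailures` are hypothetical names, and the file naming is arbitrary.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical minimal reproducer: no zip handling, only plain file writes.
class WriteProbe {

    /** Writes count small files into dir and returns how many are missing or truncated after close(). */
    static int countFailures(File dir, int count) throws IOException {
        int failures = 0;
        for (int i = 0; i < count; i++) {
            byte[] payload = ("probe " + i).getBytes(StandardCharsets.UTF_8);
            File file = new File(dir, "probe_" + i + ".txt");
            try (OutputStream out = new FileOutputStream(file)) {
                out.write(payload);
            }
            // Check only after the stream is closed, so the write has been handed to the filesystem.
            if (!file.isFile() || file.length() != payload.length) {
                failures++;
                System.err.println("Lost or truncated: " + file.getAbsolutePath());
            }
        }
        return failures;
    }
}
```

Running `WriteProbe.countFailures(...)` once against a folder under the WebDAV mount (for example below `/mnt/jcr`) and once against a local folder would show whether the problem depends on the mount rather than on the unzip logic.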

-Bertrand

Re: WebDAV write problems, loses files with no error, no consistency

Bruce Edge
Found the problem. It was the Linux WebDAV driver. Further experimentation from the shell with a WebDAV mount showed flawless operation from OS X for a variety of file sizes and loads, while a mount using the Linux WebDAV driver (davfs2 version 1.4.6-1ubuntu3) exhibited the same problems I've been having from my OSGi thread.

I tried switching from davfs2, e.g.:
mount -t davfs -o gid=sling,rw,uid=sling,username=admin  http://localhost:8090 /mnt/jcr
to fusedav:
fusedav -p=admin -u=admin http://localhost:8090 /mnt/jcr

and the problems ceased; however, the fusedav driver lacks many of the options provided by davfs2. A bit more digging yielded a newer davfs2 version based on the new libneon27 WebDAV API [1].

Installing the newer davfs2 package, 1.5.2-1, and disabling locks in /etc/davfs2/davfs2.conf also fixed the problem. This is preferable to the fusedav option as davfs2 has considerably more functionality and options than fusedav.
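
For reference, a sketch of what the lock setting might look like. The message does not quote the exact line, so the option name `use_locks` is taken from the davfs2.conf man page and should be treated as an assumption here:

```
# /etc/davfs2/davfs2.conf
# Assumed option name; 0 tells davfs2 not to lock files on the server.
use_locks 0
```

The mount would need to be unmounted and remounted for a changed configuration to take effect.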

WebDAV I/O is now reliable. Have been running load testing for hours now with no errors.

Thanks again for the pointers Bertrand.

[1]  https://launchpad.net/ubuntu/+source/davfs2

-Bruce

Re: WebDAV write problems, loses files with no error, no consistency

Felix Meschberger-3
Hi Bruce

Thanks for reporting your findings. Would you mind writing a page on the wiki [1] about your experiences?

Thanks
Felix

[1] https://cwiki.apache.org/confluence/display/SLING

Re: WebDAV write problems, loses files with no error, no consistency

Bertrand Delacretaz
In reply to this post by Bruce Edge
On Thu, Dec 11, 2014 at 8:49 AM, Bruce Edge
<[hidden email]> wrote:
> ...Found the problem. It was the Linux WebDAV driver....

Not surprising, it seems fairly common for WebDAV clients to behave in
funny ways.

That's why I recommend doing things in Sling as much as possible - you
could maybe just use simple HTTP requests to post your content to
Sling, and let the rest happen there, based on JCR observation or
other requests that trigger the required processing.
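
As an illustration of the plain-HTTP route, the sketch below uploads one file with a multipart POST. It assumes the default Sling POST servlet handles the target folder and that a file field named `*` makes Sling name the new node after the uploaded file; `SlingUpload`, its method names and the example URL are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: uploads a single file to a Sling folder without any WebDAV mount.
class SlingUpload {

    /** Builds a multipart/form-data body with one file part named "*" (assumed Sling convention). */
    static byte[] multipartBody(String boundary, String fileName, byte[] content) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        String header = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"*\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n\r\n";
        body.write(header.getBytes(StandardCharsets.UTF_8));
        body.write(content);
        body.write(("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8));
        return body.toByteArray();
    }

    /** POSTs the file to folderUrl with basic authentication and returns the HTTP status code. */
    static int upload(String folderUrl, String user, String password,
                      String fileName, byte[] content) throws IOException {
        String boundary = "sling-upload-" + System.nanoTime();
        byte[] body = multipartBody(boundary, fileName, content);
        HttpURLConnection conn = (HttpURLConnection) new URL(folderUrl).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        conn.setRequestProperty("Authorization", "Basic " + credentials);
        conn.setRequestProperty("Content-Type", "multipart/form-data; boundary=" + boundary);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        return conn.getResponseCode();
    }
}
```

A call such as `SlingUpload.upload("http://localhost:8090/content/ACT/", "admin", "admin", "Folio.xml", bytes)` would then replace the file write on the mount; unlike the silent loss seen over WebDAV, a failed request surfaces as a non-2xx status code.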

-Bertrand

Re: WebDAV write problems, loses files with no error, no consistency

Bruce Edge
In reply to this post by Felix Meschberger-3
Done, I created a short wiki page [1]

After further experimentation it came down to the lock setting so I
simplified the results to exclude any unnecessary settings.

Please move if it’s not in the appropriate location.

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=50233966

-Bruce


Re: WebDAV write problems, loses files with no error, no consistency

Bruce Edge
In reply to this post by Bertrand Delacretaz


On Thursday, December 11, 2014 at 1:55 AM, Bertrand Delacretaz wrote:
>> ...Found the problem. It was the Linux WebDAV driver....
>
> Not surprising, it seems fairly common for WebDAV clients to behave in
> funny ways.
>
> That's why I recommend doing things in Sling as much as possible - you
> could maybe just use simple HTTP requests to post your content to
> Sling, and let the rest happen there, based on JCR observation or
> other requests that trigger the required processing.

Yes, that's the long term plan. I am planning on extending Oliver's work [1] on the ContentCreator
once it's released.

For now this unblocks me and I can use the file-based library we already have.

[1] http://markmail.org/message/ai5zgl2miizthlan#query:+page:1+mid:rxs3ut6fnnwx3755+state:results

-Bruce