• Platform Support Change
  • Status: Closed
  • 2 Major
  • Resolution: Fixed
  • ehcache
  • hsingh
  • Reporter: foshea
  • August 23, 2010
  • 1
  • Watchers: 2
  • January 17, 2013
  • August 25, 2010

Description

Adding Jira for John Serock while his Jira setup is being fixed.

When running an application that uses Ehcache 1.5.0 on z/OS, an exception similar to the following was thrown:

net.sf.ehcache.CacheException: Error configuring from input stream. Initial cause was Invalid byte 1 of 1-byte UTF-8 sequence.
    at net.sf.ehcache.config.ConfigurationFactory.parseConfiguration(ConfigurationFactory.java:155)
    at net.sf.ehcache.CacheManager.parseConfiguration(CacheManager.java:266)
    at net.sf.ehcache.CacheManager.init(CacheManager.java:220)
    at net.sf.ehcache.CacheManager.<init>(CacheManager.java:202)
    at my.EhcacheTest.run(EhcacheTest.java:41)
    at my.EhcacheTest.main(EhcacheTest.java:25)
Caused by: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
    at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.arrangeCapacity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipString(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLVersionDetector.determineDocVersion(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at net.sf.ehcache.config.ConfigurationFactory.parseConfiguration(ConfigurationFactory.java:153)
    ... 5 more

The default platform encoding on the z/OS system is IBM1047, the Latin-1 character set for EBCDIC hosts. The ehcache.xml configuration file on z/OS is UTF-8 encoded.

The exception appears to be caused by the net.sf.ehcache.config.ConfigurationFactory’s translateSystemProperties method. The Ehcache 1.5.0 version of the method is shown below:

/**
 * Translates system properties which can be added as tokens to the config file using ${token} syntax.
 * <p/>
 * So, if the config file contains a character sequence "multicastGroupAddress=${multicastAddress}", and there is a system property
 * multicastAddress=230.0.0.12 then the translated sequence becomes "multicastGroupAddress=230.0.0.12"
 *
 * @param inputStream
 * @return a translated stream
 */
private static InputStream translateSystemProperties(InputStream inputStream) throws IOException {

    StringBuffer stringBuffer = new StringBuffer();
    int c;
    while ((c = inputStream.read()) != -1) {
        stringBuffer.append((char) c);
    }
    String configuration = stringBuffer.toString();

    Set tokens = extractPropertyTokens(configuration);
    Iterator tokenIterator = tokens.iterator();
    while (tokenIterator.hasNext()) {
        String token = (String) tokenIterator.next();
        String leftTrimmed = token.replaceAll("\\$\\{", "");
        String trimmedToken = leftTrimmed.replaceAll("\\}", "");

        String property = System.getProperty(trimmedToken);
        if (property == null) {
            if (LOG.isDebugEnabled()) {
                LOG.debug("Did not find a system property for the " + token +
                        " token specified in the configuration.Replacing with \"\"");
            }
        } else {
            configuration = configuration.replaceAll("\\$\\{" + trimmedToken + "\\}", property);
            if (LOG.isDebugEnabled()) {
                LOG.debug("Found system property value of " + property + " for the " + token +
                        " token specified in the configuration.");
            }
        }
    }
    return new ByteArrayInputStream(configuration.getBytes());
}

The method (1) reads bytes from an input stream and creates a string from the bytes, (2) translates any system properties found in the string, and (3) converts the string to bytes and creates an input stream from those bytes. In step (1), each byte is cast directly to a char, so no platform encoding is involved when the string is built. In step (3), however, the string is converted to bytes by the String.getBytes() method (i.e., configuration.getBytes() on the last line of the method), which uses the default platform encoding to create the byte sequence.

Because the default platform encoding was IBM1047, the String.getBytes() method returned an IBM1047 byte sequence instead of a UTF-8 byte sequence.

Because the ehcache.xml configuration file did not contain an XML prologue specifying an encoding other than UTF-8, the SAX parser threw the MalformedByteSequenceException when it determined that the bytes in the input stream did not comprise a valid UTF-8 sequence.
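The mismatch described above can be illustrated with a minimal sketch. It is not part of Ehcache: the class name EncodingMismatchDemo and its helper methods are hypothetical, and the sample document "<ehcache/>" is only a stand-in for a real configuration file. The sketch assumes the JDK in use ships the IBM1047 charset, which is why the lookup is guarded with Charset.isSupported.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingMismatchDemo {

    /** Encodes text with the named charset, as String.getBytes() would if that charset were the platform default. */
    static byte[] encodeAs(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    /** Returns true if the bytes, when decoded as UTF-8, give back the original text. */
    static boolean roundTripsAsUtf8(String text, byte[] bytes) {
        return text.equals(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String xml = "<ehcache/>"; // stand-in for the real configuration

        byte[] utf8 = encodeAs(xml, "UTF-8");
        System.out.println("UTF-8 bytes round-trip as UTF-8: " + roundTripsAsUtf8(xml, utf8));

        // Assumption: IBM1047 may be missing from some runtimes, so check first.
        if (Charset.isSupported("IBM1047")) {
            byte[] ebcdic = encodeAs(xml, "IBM1047");
            System.out.println("IBM1047 bytes equal UTF-8 bytes: " + Arrays.equals(utf8, ebcdic));
            System.out.println("IBM1047 bytes round-trip as UTF-8: " + roundTripsAsUtf8(xml, ebcdic));
        } else {
            System.out.println("IBM1047 charset is not available on this runtime.");
        }
    }
}
```

A parser that expects UTF-8 sees the second byte sequence as different input, which is consistent with the MalformedByteSequenceException in the stack trace.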

Comments

Fiona OShea 2010-08-23

Potential solution:
===================

One potential solution is to change the last line of the translateSystemProperties method so that it always returns a UTF-8 sequence:

    return new ByteArrayInputStream(configuration.getBytes("UTF-8"));

The only change is to pass the name of the UTF-8 encoding to the String.getBytes method. This change did resolve the issue when tested.
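For illustration only, a sketch of the same idea with the charset made explicit on both the decoding and the encoding side might look like the following. It is not the code that was committed to Ehcache: the class Utf8TranslateSketch, the simplified single-token translate method, and the readAll helper are all hypothetical, and the sketch assumes the configuration file is UTF-8 encoded. Decoding through an InputStreamReader rather than casting each byte to a char also keeps non-ASCII characters intact.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class Utf8TranslateSketch {

    /**
     * Hypothetical, simplified stand-in for translateSystemProperties:
     * replaces one ${token} with a value, decoding and encoding as UTF-8.
     */
    static InputStream translate(InputStream in, String token, String value) throws IOException {
        // Decode explicitly as UTF-8 instead of relying on the platform default.
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        StringBuilder buffer = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            buffer.append((char) c);
        }
        String configuration = buffer.toString().replace("${" + token + "}", value);
        // Encode explicitly as UTF-8 so the parser receives what it expects.
        return new ByteArrayInputStream(configuration.getBytes(StandardCharsets.UTF_8));
    }

    /** Drains a stream into a byte array. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            out.write(b);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] input = "<cache name=\"${cacheName}\" note=\"caf\u00e9\"/>"
                .getBytes(StandardCharsets.UTF_8);
        byte[] output = readAll(translate(new ByteArrayInputStream(input), "cacheName", "demo"));
        System.out.println(new String(output, StandardCharsets.UTF_8));
    }
}
```

Because both directions name the charset, the result does not depend on the file.encoding setting of the JVM.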

Fiona OShea 2010-08-23

Additional information:
=======================

I have attached a test application (EhcacheTest.java) and an Ehcache configuration (ehcache.xml) that demonstrate the issue. The exception stack trace shown above was produced on a non-z/OS platform by running the test application with the following JVM arguments:

-ea -Dfile.encoding=IBM1047

The test application depends on ehcache-1.5.0.jar, commons-logging-api-1.1.1.jar, and backport-util-concurrent.jar.

Ideally, Ehcache 1.5 and higher would be supported on z/OS. We are currently using the Spring Framework version 2.5.6, which supports Ehcache 1.5.0. Eventually, we will migrate to Spring 3.0, which appears to support Ehcache 1.6.2.

Ludovic Orban 2010-08-25

Solved by implementing the proposed fix.

Charset conversion errors should no longer occur now that encoding and decoding are hardcoded to always use the same charset.

Himadri Singh 2010-11-03

Verified with -Dfile.encoding=IBM1047 using the attached Java class.