EncodingTranslatorStream - A stream object that translates character encoding



This class is a stream designed to perform character encoding translation. You instantiate it with an input stream, the input encoding, and the desired output encoding; when you read from this stream, it translates the source data from one encoding to the other.

I'm interested in a general code review on style/organization of code, as well as any apparent bugs and performance considerations.

/// <summary>
/// This class is a stream designed to perform character encoding translation from one encoding to another.
/// </summary>
public class EncodingTranslatorStream : System.IO.Stream
{
    /// <summary>
    /// Input data.  This is the data that well be decoded, and re-encoded in the specified encoding
    /// </summary>
    private System.IO.Stream strInput_m;

    /// <summary>
    /// Input stream reader.  This will be responsible for decoding the input bytes into unicode characters based on the specified input encoding
    /// </summary>
    private StreamReader srInput_m;

    /// <summary>
    /// Output stream reader.  This will be responsible for encoding unicode characters into bytes based on the specified output encoding
    /// </summary>
    private StreamWriter swOutput_m;

    /// <summary>
    /// Holds a stream of bytes, and when read, the bytes are automatically removed from the stream
    /// </summary>
    private Stream strOut_m;

    /// <summary>
    /// Constructor.  Specifies the input and output encoding.
    /// </summary>
    /// <param name="strInput">Input data, that will be decoded and re-encode into the specified output encoding</param>
    /// <param name="encodingIn">The input character encoding to use.</param>
    /// <param name="encodingOut">Output encoding</param>
    /// <remarks>
    /// The character encoding is set by the encoding parameter.
    /// The StreamReader object attempts to detect the encoding by looking at the first three bytes of the stream. It automatically recognizes UTF-8, little-endian Unicode, and big-endian Unicode text if the file starts with the appropriate byte order marks. Otherwise, the user-provided encoding is used.
    /// </remarks>
    public EncodingTranslatorStream(System.IO.Stream strInput, Encoding encodingIn, Encoding encodingOut)
    {
        this.Init(strInput, encodingOut);
        this.srInput_m = new StreamReader(strInput, encodingIn);
    }

    /// <summary>
    /// Constructor.  Specifies the input and output encoding, and a byte order mark detection option for the input stream
    /// </summary>
    /// <param name="strInput">Input data, that will be decoded and re-encode into the specified output encoding</param>
    /// <param name="encodingIn">The input character encoding to use.</param>
    /// <param name="encodingOut">Output encoding</param>
    /// <param name="bDetectInputEncodingFromByteOrderMarks">Indicates whether to look for byte order marks at the beginning of the input stream.</param>
    /// <remarks>
    /// This constructor initializes the encoding as specified by the encoding parameter.
    /// The bDetectInputEncodingFromByteOrderMarks parameter, if true, detects the encoding by looking at the first three bytes of the stream. It automatically recognizes UTF-8, little-endian Unicode, and big-endian Unicode text if the file starts with the appropriate byte order marks. Otherwise, the user-provided encoding is used.
    /// </remarks>
    public EncodingTranslatorStream(System.IO.Stream strInput, Encoding encodingIn, bool bDetectInputEncodingFromByteOrderMarks, Encoding encodingOut)
    {
        this.Init(strInput, encodingOut);
        this.srInput_m = new StreamReader(strInput, encodingIn, bDetectInputEncodingFromByteOrderMarks);
    }

    /// <summary>
    /// Constructor.  Specifies an output encoding,  and a byte order mark detection option for the input stream
    /// </summary>
    /// <param name="strInput">Input data, that will be decoded and re-encode into the specified output encoding</param>
    /// <param name="encodingOut">Output encoding</param>
    /// <param name="bDetectInputEncodingFromByteOrderMarks">Indicates whether to look for byte order marks at the beginning of the input stream.</param>
    /// <remarks>
    /// This constructor initializes the encoding to UTF8Encoding
    /// The detectEncodingFromByteOrderMarks parameter, if true, detects the encoding by looking at the first three bytes of the stream. It automatically recognizes UTF-8, little-endian Unicode, and big-endian Unicode text if the file starts with the appropriate byte order marks. Otherwise, the UTF8Encoding is used. See the Encoding.GetPreamble method for more information.
    /// </remarks>
    public EncodingTranslatorStream(System.IO.Stream strInput, bool bDetectInputEncodingFromByteOrderMarks, Encoding encodingOut)
    {
        this.Init(strInput, encodingOut);
        this.srInput_m = new StreamReader(strInput, bDetectInputEncodingFromByteOrderMarks);
    }

    private void Init(Stream strInput, Encoding encodingOut)
    {
        this.strInput_m = strInput;

        //Because the output bytes of an encoding translation can be larger than what we want in a single read, we need
        //somewhere to store it
        this.strOut_m = new MemoryQueueBufferStream();
        //this.strOut_m = new MemoryStream();

        this.swOutput_m = new StreamWriter(this.strOut_m, encodingOut);
    }

    public override bool CanRead
    {
        get { return this.strInput_m.CanRead; }
    }

    public override bool CanSeek
    {
        get { return this.strInput_m.CanSeek; }
    }

    public override bool CanWrite
    {
        get { return false; }
    }

    public override void Flush()
    {
        this.strInput_m.Flush();
    }

    /// <summary>
    /// Returns the length of the string in bytes.  Note, depending on the encoding type of the stream, the byte length will vary,
    /// as characters may require multiple bytes for certain encodings.  Some encodings allow different byte lengths depending on the
    /// character.  This function will return the maximum amount of bytes that the string may take, as returning the actual
    /// requires processing the entire string which is time and memory consuming.
    /// </summary>
    public override long Length
    {
        get
        {
            //This returns the length of the input stream
            return this.strInput_m.Length;
        }
    }

    /// <summary>
    /// The actual position in bytes (not characters)
    /// </summary>
    public override long Position
    {
        get
        {
            return this.strInput_m.Position;
        }
        set
        {
            this.strInput_m.Position = value;
        }
    }

    /// <summary>
    /// Our temporary pool of characters.  This acts as the middle-man when translating encodings.  Bytes are decoded into this as chars, then encoded back into
    /// bytes.  We will re-use this cache so we don't have to keep instantiating the array.
    /// </summary>
    private char[] lstChars_m;

    /// <summary>
    /// Reads bytes from the stream.  Bytes will be returned in the output encoding specified, regardless of the input encoding
    /// </summary>
    /// <param name="buffer">Buffer to fill</param>
    /// <param name="offset">Start position in the buffer</param>
    /// <param name="count">Count of bytes to read and put in the buffer.  Buffer needs to be long enough to accomodate <paramref name="offset"/> + <paramref name="count"/></param>
    /// <returns></returns>
    public override int Read(byte[] buffer, int offset, int count)
    {
        if (this.srInput_m.CurrentEncoding.Equals(this.swOutput_m.Encoding))
        {
            //The encodings are the same,  lets just bypass the translation stuff
            return this.strInput_m.Read(buffer, offset, count);
        }

        //We are reading data in one encodng, and outputing the data using another encoding
        //The process is to read bytes from an input stream, decode them, based on a specified encoding,
        //to chars which are unicode then encode them to bytes based on a specified encoding
        //Note that the number of input bytes may be more or less than the number of output bytes because
        //Some encodings are multibyte and some are not.  Even if both encodings are multibyte they still may not
        //use the same number of bytes for any given character.

        //Validate the parameters passed in
        this.ValidateBufferArgs(buffer, offset, count);

        int iTotalBytesRead = 0;

        //If there are decoded bytes still in the output stream that havent been read,  return them
        if (this.strOut_m.Length > 0)
        {
            //Read from output stream into the read buffer
            int iBytesRead = this.strOut_m.Read(buffer, offset, count);
            iTotalBytesRead += iBytesRead;

            //While there are still bytes to read from the output stream and we have reached our limit
            while (iBytesRead > 0 && iTotalBytesRead < count)
            {
                iBytesRead = this.strOut_m.Read(buffer, offset + iTotalBytesRead, count);
                iTotalBytesRead += iBytesRead;
            }
        }

        int iRemainingBytesToRead = count - iTotalBytesRead;

        //If we still haven't reached our limit
        if (iRemainingBytesToRead > 0)
        {
            //We need to convert our input to chars, so ensure we have a buffer we can re-use, or create a new one
            if (this.lstChars_m == null || lstChars_m.Length < count)
            {
                //The max number of chars we will need to deal with is the number of bytes we want to read.
                this.lstChars_m = new char[count];
            }

            //Convert our input bytes to chars.  Reading from our input StreamReader will take care of decoding bytes, from the input stream, into chars.
            //Our streams read method accepts a byte count of bytes to return, but the StreamReader requires a char count.  Depending on the input encoding
            //specified, there may be more than 1 byte per character.  We don't know exactly how many bytes to read from the input stream, so we will
            //use the byte count as the char count.  At most this will read more bytes than we actually want, but that's ok.
            int iCharsRead = this.srInput_m.Read(this.lstChars_m, 0, iRemainingBytesToRead);

            if (iCharsRead > 0)
            {
                //Convert our chars to bytes using the specified output encoding.  Writing to our output stream writer will take care of encoding.
                //Converting chars to bytes may result in more bytes than were requested but because we're writting to an output stream that is a MemoryQueueBufferStream
                //that stream will hold on to the extra bytes, allowing us to only return what was asked for now, and let us return the rest on subsequent calls
                //to this read method.

                long lOutputPosition = this.strOut_m.Position;
                this.swOutput_m.Write(this.lstChars_m, 0, iCharsRead);
                this.swOutput_m.Flush();

                //If we need to go back the pre-write position.
                //MemoryStream position will advance as data is written to it
                //MemoryQueueBufferStream position will not advance as data is written to it
                if (this.strOut_m.CanSeek && this.strOut_m.Position != lOutputPosition)
                {
                    this.strOut_m.Position = lOutputPosition;
                }

                //The output stream now contains a series of bytes that we can return.  When we read bytes from the stream, the data will be removed from the stream
                int iBytesRead = this.strOut_m.Read(buffer, offset + iTotalBytesRead, count);
                iTotalBytesRead += iBytesRead;
            }
        }
        return iTotalBytesRead;
    }

    public override long Seek(long offset, System.IO.SeekOrigin origin)
    {
        return this.strInput_m.Seek(offset, origin);
    }

    public override void SetLength(long value)
    {
        throw new NotSupportedException("Setting the length of the stream is not supported.");
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException("Writing to the stream is not supported.");
    }

    private void ValidateBufferArgs(byte[] buffer, int offset, int count)
    {
        if (offset < 0)
        {
            throw new ArgumentOutOfRangeException("offset", "offset must be non-negative");
        }
        if (count < 0)
        {
            throw new ArgumentOutOfRangeException("count", "count must be non-negative");
        }
        if ((buffer.Length - offset) < count)
        {
            throw new ArgumentException("requested count exceeds available size");
        }
    }
}
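
For context, here is a small in-memory sketch of why the translated output can be larger than the requested byte count, which is the reason the class buffers leftover bytes. It is not part of the class above: `EncodingSizeDemo` and `Translate` are hypothetical names used only for illustration, and the sketch relies on the standard `Encoding.Convert` method rather than on streaming.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class EncodingSizeDemo
{
    // Re-encodes a complete byte array from one encoding to another in memory.
    public static byte[] Translate(byte[] input, Encoding from, Encoding to)
    {
        if (input == null) throw new ArgumentNullException("input");
        return Encoding.Convert(from, to, input);
    }
}
```

Translating the Latin-1 bytes of "caf\u00e9" (4 bytes) to UTF-8 with this helper yields 5 bytes, because the accented character needs two bytes in UTF-8. A caller that asked for 4 bytes would therefore leave one byte behind for the next read.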
     

List of answers

Best answer (4 votes)



Constructors

Because calling the StreamReader constructor (Stream, Encoding) is equivalent to calling the overloaded constructor (Stream, Encoding, bool) with true for the bool parameter, you should use constructor chaining.

From the first link:

The StreamReader object attempts to detect the encoding by looking at the first three bytes of the stream. It automatically recognizes UTF-8, little-endian Unicode, and big-endian Unicode text if the file starts with the appropriate byte order marks. Otherwise, the user-provided encoding is used.

So you could simplify your code like

public EncodingTranslatorStream(System.IO.Stream strInput, Encoding encodingIn, Encoding encodingOut)
    : this(strInput, encodingIn, true, encodingOut)
{
}

Dead code like //this.strOut_m = new MemoryStream(); should be removed because it only adds noise to the code and no value.

The same is true for comments that say what is done instead of why it is done. Please read this fine answer about comments: https://codereview.stackexchange.com/a/90113/29371


Possible problems

The overridden Position property can cause problems if an incorrectly implemented subclass of Stream is passed to the constructor.

Usually a Stream should throw a NotSupportedException if the Position property is set and the stream is not seekable. So if you check CanSeek before setting the Position, you have done everything you can to prevent unexpected exceptions. If a subclass of the Stream class that doesn't follow this pattern is passed in, no one can blame you.
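
A minimal sketch of that guard, written as a standalone helper so it can be tried in isolation. `SeekGuard` and `SetPosition` are hypothetical names, not part of the reviewed class; inside the class the same check would sit in the Position setter.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only.
public static class SeekGuard
{
    // Sets the position only when the stream reports that it is seekable;
    // otherwise throws the exception a well-behaved Stream would throw itself.
    public static void SetPosition(Stream inner, long value)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        if (!inner.CanSeek)
        {
            throw new NotSupportedException("The underlying stream does not support seeking.");
        }
        inner.Position = value;
    }
}
```

With this in place the wrapper fails in a predictable way even when the wrapped stream does not follow the usual contract.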


Again the naming

This block of code

if (this.srInput_m.CurrentEncoding.Equals(this.swOutput_m.Encoding))
{
    //The encodings are the same,  lets just bypass the translation stuff
    return this.strInput_m.Read(buffer, offset, count);
}

is very hard to read and grasp at first glance. The difference between srInput_m and strInput_m is extremely small.

You should at least rename srInput_m to reader or streamReader, with or without your postfix of choice, although I would drop the postfix if you always use this.


Validation

In the Read() method you have

//Validate the parameters passed in
this.ValidateBufferArgs(buffer, offset, count);

but the parameters are already used before this validation, in the branch taken when the encoding of the StreamReader equals the encoding of the StreamWriter.

Either omit the check (I guess the Stream and StreamReader take care of this) or move it to the top of the method.
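
If you keep the check, a sketch of a validation helper that can be the very first statement of Read() might look like this. `ReadArgs.Validate` is a hypothetical name; the three range checks mirror ValidateBufferArgs from the question, while the null check is an addition of this sketch and not something the original code does.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class ReadArgs
{
    // Validates Read() arguments before any code path touches them.
    public static void Validate(byte[] buffer, int offset, int count)
    {
        // Assumption: a null check is desirable; the question's code omits it.
        if (buffer == null) throw new ArgumentNullException("buffer");
        if (offset < 0) throw new ArgumentOutOfRangeException("offset", "offset must be non-negative");
        if (count < 0) throw new ArgumentOutOfRangeException("count", "count must be non-negative");
        if (buffer.Length - offset < count) throw new ArgumentException("requested count exceeds available size");
    }
}
```

Calling it before the encoding comparison means the pass-through branch and the translating branch see the same validated arguments.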


Read() method

Basically, the Read() method is way too long. You should consider breaking it into smaller methods, which are easier to maintain and easier to read.

 
 
1 vote


 

In the recent version of the code, if you have already read iTotalBytesRead bytes into the "buffer" variable, the third parameter (count) in the following line:

iBytesRead = this.strOut_m.Read(buffer, offset + iTotalBytesRead, count); 

must be "count - iTotalBytesRead":

iBytesRead = this.strOut_m.Read(buffer, offset + iTotalBytesRead, count - iTotalBytesRead); 

otherwise it won't pass your own validation in MemoryQueuedBufferStream -- ValidateBufferArgs
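To illustrate why the remaining count matters, here is a minimal, self-contained sketch of such a fill loop. BufferFill and ReadUpTo are hypothetical names and not part of the reviewed code. With a MemoryStream as the source, passing the full count on every call would throw an ArgumentException as soon as offset + totalBytesRead + count exceeds the buffer length.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the reviewed code.
public static class BufferFill
{
    // Reads up to count bytes from source into buffer, starting at offset.
    // Each call requests only the bytes that still fit into the requested range.
    public static int ReadUpTo(Stream source, byte[] buffer, int offset, int count)
    {
        int totalBytesRead = 0;
        while (totalBytesRead < count)
        {
            int bytesRead = source.Read(buffer, offset + totalBytesRead, count - totalBytesRead);
            if (bytesRead == 0)
            {
                break; // end of stream
            }
            totalBytesRead += bytesRead;
        }
        return totalBytesRead;
    }
}
```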

 
 
0
 
vote


Recent revision:

/// <summary>
/// This class is a stream designed to perform character encoding translation from one encoding to another.
/// </summary>
public class EncodingTranslatorStream : System.IO.Stream
{
    /// <summary>
    /// Input data.  This is the data that will be decoded, and re-encoded in the specified encoding
    /// </summary>
    private System.IO.Stream strInput_m;

    /// <summary>
    /// Input stream reader.  This will be responsible for decoding the input bytes into unicode characters based on the specified input encoding
    /// </summary>
    private StreamReader rdrInput_m;

    /// <summary>
    /// Output stream writer.  This will be responsible for encoding unicode characters into bytes based on the specified output encoding
    /// </summary>
    private StreamWriter wrtOutput_m;

    /// <summary>
    /// Holds a stream of bytes, and when read, the bytes are automatically removed from the stream
    /// </summary>
    private Stream strOut_m;

    /// <summary>
    /// Constructor.  Specifies the input and output encoding.
    /// </summary>
    /// <param name="strInput">Input data, that will be decoded and re-encoded into the specified output encoding</param>
    /// <param name="encodingIn">The input character encoding to use.</param>
    /// <param name="encodingOut">Output encoding</param>
    /// <remarks>
    /// The character encoding is set by the encoding parameter.
    /// The StreamReader object attempts to detect the encoding by looking at the first three bytes of the stream. It automatically recognizes UTF-8, little-endian Unicode, and big-endian Unicode text if the file starts with the appropriate byte order marks. Otherwise, the user-provided encoding is used.
    /// </remarks>
    public EncodingTranslatorStream(System.IO.Stream strInput, Encoding encodingIn, Encoding encodingOut)
        : this(strInput, encodingIn, true, encodingOut)
    {
    }

    /// <summary>
    /// Constructor.  Specifies the input and output encoding, and a byte order mark detection option for the input stream
    /// </summary>
    /// <param name="strInput">Input data, that will be decoded and re-encoded into the specified output encoding</param>
    /// <param name="encodingIn">The input character encoding to use.</param>
    /// <param name="encodingOut">Output encoding</param>
    /// <param name="bDetectInputEncodingFromByteOrderMarks">Indicates whether to look for byte order marks at the beginning of the input stream.</param>
    /// <remarks>
    /// This constructor initializes the encoding as specified by the encoding parameter.
    /// The bDetectInputEncodingFromByteOrderMarks parameter, if true, detects the encoding by looking at the first three bytes of the stream. It automatically recognizes UTF-8, little-endian Unicode, and big-endian Unicode text if the file starts with the appropriate byte order marks. Otherwise, the user-provided encoding is used.
    /// </remarks>
    public EncodingTranslatorStream(System.IO.Stream strInput, Encoding encodingIn, bool bDetectInputEncodingFromByteOrderMarks, Encoding encodingOut)
    {
        this.Init(strInput, encodingOut);
        this.rdrInput_m = new StreamReader(strInput, encodingIn, bDetectInputEncodingFromByteOrderMarks);
    }

    /// <summary>
    /// Constructor.  Specifies an output encoding,  and a byte order mark detection option for the input stream
    /// </summary>
    /// <param name="strInput">Input data, that will be decoded and re-encoded into the specified output encoding</param>
    /// <param name="encodingOut">Output encoding</param>
    /// <param name="bDetectInputEncodingFromByteOrderMarks">Indicates whether to look for byte order marks at the beginning of the input stream.</param>
    /// <remarks>
    /// This constructor initializes the encoding to UTF8Encoding
    /// The detectEncodingFromByteOrderMarks parameter, if true, detects the encoding by looking at the first three bytes of the stream. It automatically recognizes UTF-8, little-endian Unicode, and big-endian Unicode text if the file starts with the appropriate byte order marks. Otherwise, the UTF8Encoding is used. See the Encoding.GetPreamble method for more information.
    /// </remarks>
    public EncodingTranslatorStream(System.IO.Stream strInput, bool bDetectInputEncodingFromByteOrderMarks, Encoding encodingOut)
    {
        this.Init(strInput, encodingOut);
        this.rdrInput_m = new StreamReader(strInput, bDetectInputEncodingFromByteOrderMarks);
    }

    private void Init(Stream strInput, Encoding encodingOut)
    {
        this.strInput_m = strInput;

        //Because the output bytes of an encoding translation can be larger than what we want in a single read, we need
        //somewhere to store it
        this.strOut_m = new MemoryQueueBufferStream();
        //this.strOut_m = new MemoryStream();

        this.wrtOutput_m = new StreamWriter(this.strOut_m, encodingOut);
    }

    public override bool CanRead
    {
        get { return this.strInput_m.CanRead; }
    }

    public override bool CanSeek
    {
        get { return this.strInput_m.CanSeek; }
    }

    public override bool CanWrite
    {
        get { return false; }
    }

    public override void Flush()
    {
        this.strInput_m.Flush();
    }

    /// <summary>
    /// Returns the length of the string in bytes.  Note, depending on the encoding type of the stream, the byte length will vary,
    /// as characters may require multiple bytes for certain encodings.  Some encodings allow different byte lengths depending on the
    /// character.  This function will return the maximum amount of bytes that the string may take, as returning the actual
    /// requires processing the entire string which is time and memory consuming.
    /// </summary>
    public override long Length
    {
        get
        {
            //This returns the length of the input stream
            return this.strInput_m.Length;
        }
    }

    /// <summary>
    /// The actual position in bytes (not characters)
    /// </summary>
    public override long Position
    {
        get
        {
            if (this.strInput_m.CanSeek)
            {
                return this.strInput_m.Position;
            }
            else
            {
                throw new NotSupportedException(string.Format("Input stream ({0}) does not support seeking.", this.strInput_m.GetType().Name));
            }
        }
        set
        {
            this.strInput_m.Position = value;
        }
    }

    /// <summary>
    /// Our temporary pool of characters.  This acts as the middle-man when translating encodings.  Bytes are decoded into this as chars, then encoded back into
    /// bytes.  We will re-use this cache so we don't have to keep instantiating the array.
    /// </summary>
    private char[] lstChars_m;

    /// <summary>
    /// Reads bytes from the stream.  Bytes will be returned in the output encoding specified, regardless of the input encoding
    /// </summary>
    /// <param name="buffer">Buffer to fill</param>
    /// <param name="offset">Start position in the buffer</param>
    /// <param name="count">Count of bytes to read and put in the buffer.  Buffer needs to be long enough to accommodate <paramref name="offset"/> + <paramref name="count"/></param>
    /// <returns></returns>
    public override int Read(byte[] buffer, int offset, int count)
    {
        //Validate the parameters passed in
        this.ValidateBufferArgs(buffer, offset, count);

        if (this.rdrInput_m.CurrentEncoding.Equals(this.wrtOutput_m.Encoding))
        {
            //The encodings are the same,  lets just bypass the translation stuff
            return this.strInput_m.Read(buffer, offset, count);
        }

        //We are reading data in one encoding, and outputting the data using another encoding
        //The process is to read bytes from an input stream, decode them, based on a specified encoding,
        //to chars which are unicode then encode them to bytes based on a specified encoding
        //Note that the number of input bytes may be more or less than the number of output bytes because
        //Some encodings are multibyte and some are not.  Even if both encodings are multibyte they still may not
        //use the same number of bytes for any given character.
        int iTotalBytesRead = 0;

        //If there are decoded bytes still in the output stream that haven't been read,  return them
        if (this.strOut_m.Length > 0)
        {
            //Read from output stream into the read buffer
            int iBytesRead = this.strOut_m.Read(buffer, offset, count);
            iTotalBytesRead += iBytesRead;

            //While there are still bytes to read from the output stream and we have not reached our limit
            while (iBytesRead > 0 && iTotalBytesRead < count)
            {
                iBytesRead = this.strOut_m.Read(buffer, offset + iTotalBytesRead, count);
                iTotalBytesRead += iBytesRead;
            }
        }

        int iRemainingBytesToRead = count - iTotalBytesRead;

        //If we still haven't reached our limit
        if (iRemainingBytesToRead > 0)
        {
            //We need to convert our input to chars, so ensure we have a buffer we can re-use, or create a new one
            if (this.lstChars_m == null || lstChars_m.Length < count)
            {
                //The max number of chars we will need to deal with is the number of bytes we want to read.
                this.lstChars_m = new char[count];
            }

            //Convert our input bytes to chars.  Reading from our input StreamReader will take care of decoding bytes, from the input stream, into chars.
            //Our streams read method accepts a byte count of bytes to return, but the StreamReader requires a char count.  Depending on the input encoding
            //specified, there may be more than 1 byte per character.  We don't know exactly how many bytes to read from the input stream, so we will
            //use the byte count as the char count.  At most this will read more bytes than we actually want, but that's ok.
            int iCharsRead = this.rdrInput_m.Read(this.lstChars_m, 0, iRemainingBytesToRead);

            if (iCharsRead > 0)
            {
                //Convert our chars to bytes using the specified output encoding.  Writing to our output stream writer will take care of encoding.
                //Converting chars to bytes may result in more bytes than were requested but because we're writing to an output stream that is a MemoryQueueBufferStream
                //that stream will hold on to the extra bytes, allowing us to only return what was asked for now, and let us return the rest on subsequent calls
                //to this read method.
                long lOutputPosition = this.strOut_m.Position;
                this.wrtOutput_m.Write(this.lstChars_m, 0, iCharsRead);
                this.wrtOutput_m.Flush();

                //If we need to go back the pre-write position.
                //MemoryStream position will advance as data is written to it
                //MemoryQueueBufferStream position will not advance as data is written to it
                if (this.strOut_m.Position != lOutputPosition)
                {
                    if (this.strOut_m.CanSeek)
                    {
                        this.strOut_m.Position = lOutputPosition;
                    }
                    else
                    {
                        throw new NotSupportedException(string.Format("The output stream ({0}) needs to be seeked after it was written to but it does not support that operation.", this.strOut_m.GetType().FullName));
                    }
                }

                //The output stream now contains a series of bytes that we can return.  When we read bytes from the stream, the data will be removed from the stream
                int iBytesRead = this.strOut_m.Read(buffer, offset + iTotalBytesRead, count);
                iTotalBytesRead += iBytesRead;
            }
        }
        return iTotalBytesRead;
    }

    public override long Seek(long offset, System.IO.SeekOrigin origin)
    {
        return this.strInput_m.Seek(offset, origin);
    }

    public override void SetLength(long value)
    {
        throw new NotSupportedException("Setting the length of the stream is not supported.");
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException("Writing to the stream is not supported.");
    }

    private void ValidateBufferArgs(byte[] buffer, int offset, int count)
    {
        if (offset < 0)
        {
            throw new ArgumentOutOfRangeException("offset", "offset must be non-negative");
        }
        if (count < 0)
        {
            throw new ArgumentOutOfRangeException("count", "count must be non-negative");
        }
        if ((buffer.Length - offset) < count)
        {
            throw new ArgumentException("requested count exceeds available size");
        }
    }
}
 
 
   
   
