lestes::lang::cplus::lex::ucn_token_buffer Class Reference

Token buffer.

#include <ucn_token_buffer.hh>

Inheritance diagram for lestes::lang::cplus::lex::ucn_token_buffer:

[Inheritance diagram; classes shown: lestes::std::object, lestes::std::mem::keystone]

Public Types

typedef buffer_type::size_type size_type
 Type for size of the buffer.

Public Member Functions

void add_back (const ptr< ucn_token > &item)
 Adds single item to the end.
void advance (size_type len)
 Discards items from the beginning.
ptr< ucn_token > peek_front (void) const
 Returns item at the beginning.
ptr< token_value > extract_until (ucn stop)
 Extracts value until stop character without interpreting.
ptr< token_value > extract_ordinary (size_type len)
 Extracts value without interpreting.
ptr< token_value > extract_simple_ucn (size_type len, bool identifier)
 Extracts identifiers and numbers with ucn.
ptr< token_value > extract_invalid_ucn (size_type len, bool identifier)
 Extracts identifiers and numbers with bad ucn.
ptr< token_value > extract_ucn_literal (size_type len)
 Extracts literal with escaped ucn.
ptr< token_value > extract_bad_literal (size_type len)
 Extracts malformed literal.
size_type length (void) const
 Returns length of the buffer.

Static Public Member Functions

static ptr< ucn_token_buffer > create (const ptr< line_control > &a_lines)
 Returns new buffer.

Protected Member Functions

 ucn_token_buffer (const ptr< line_control > &a_lines)
 Creates empty buffer.
virtual void gc_mark (void)
 Marks the object.

Private Types

typedef list< srp< ucn_token > > buffer_type
 Type of buffer to hold stored data.

Private Member Functions

 ucn_token_buffer (const ucn_token_buffer &)
 Hides copy constructor.
ucn_token_buffer & operator= (const ucn_token_buffer &)
 Hides assignment operator.

Private Attributes

srp< buffer_type > buffer
 Buffer to hold stored data.
srp< line_control > lines
 Line control to transform locations.

Detailed Description

Token buffer.

Represents flexible buffer holding ucn_token items. Items are added mostly one by one at the end and afterwards removed from the beginning in longer runs, forming literals and identifiers.

Definition at line 59 of file ucn_token_buffer.hh.
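
The usage pattern described above (append items one by one, inspect the front, then drop a run from the beginning) can be illustrated with a minimal standalone sketch. It is not the lestes implementation: `sketch_buffer` is a hypothetical stand-in that stores plain `char32_t` code points in a `std::list` instead of `srp< ucn_token >` items, and it uses the standard `assert` where the real class uses `lassert`.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical stand-in for ucn_token_buffer; items are plain code points
// rather than ptr<ucn_token>, and there is no line control or garbage collection.
class sketch_buffer {
public:
    typedef std::list<char32_t> buffer_type;
    typedef buffer_type::size_type size_type;

    // Adds a single item to the end of the buffer.
    void add_back(char32_t item) { buffer.push_back(item); }

    // Discards len items from the beginning; len must not exceed length().
    void advance(size_type len) {
        assert(len <= length());
        while (len != 0) {
            buffer.pop_front();
            --len;
        }
    }

    // Returns the item at the beginning; the buffer must not be empty.
    char32_t peek_front() const {
        assert(length() != 0);
        return buffer.front();
    }

    // Returns the number of stored items.
    size_type length() const { return buffer.size(); }

private:
    buffer_type buffer;
};
```

A linked list suits this access pattern because both `push_back` and `pop_front` run in constant time and erasing a prefix never moves the remaining items.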


Member Typedef Documentation

typedef list< srp<ucn_token> > lestes::lang::cplus::lex::ucn_token_buffer::buffer_type [private]

Type of buffer to hold stored data.

Definition at line 62 of file ucn_token_buffer.hh.

typedef buffer_type::size_type lestes::lang::cplus::lex::ucn_token_buffer::size_type

Type for size of the buffer.

Author:
TMA

Definition at line 68 of file ucn_token_buffer.hh.


Constructor & Destructor Documentation

lestes::lang::cplus::lex::ucn_token_buffer::ucn_token_buffer ( const ptr< line_control > &  a_lines  )  [protected]

Creates empty buffer.

Creates empty ucn token buffer.

Postcondition:
length() == 0
Precondition:
a_lines != NULL
Parameters:
a_lines The associated line control.

Definition at line 56 of file ucn_token_buffer.cc.

Referenced by create().

00056 ucn_token_buffer::ucn_token_buffer(const ptr<line_control> &a_lines):
00057         buffer(buffer_type::create()),
00058         lines(checked(a_lines))
00059 {
00060 }

lestes::lang::cplus::lex::ucn_token_buffer::ucn_token_buffer ( const ucn_token_buffer &  )  [private]

Hides copy constructor.
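
The combination documented here (a protected constructor reached only through the static create(), a non-null precondition on the line control, and a hidden copy constructor) can be sketched in standard C++ as follows. This is only an approximation under stated assumptions: `buffer_sketch` and `line_control_sketch` are hypothetical names, and `std::shared_ptr` stands in for the garbage-collected `ptr<>` and `srp<>` pointers of lestes, whose exact semantics are not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>

// Hypothetical placeholder for the line_control class.
struct line_control_sketch {};

class buffer_sketch {
public:
    // The only public way to obtain a buffer; returns a new, empty instance.
    static std::shared_ptr<buffer_sketch> create(
            const std::shared_ptr<line_control_sketch> &a_lines) {
        return std::shared_ptr<buffer_sketch>(new buffer_sketch(a_lines));
    }

    std::size_t length() const { return buffer.size(); }

protected:
    // Precondition: a_lines is not null. Postcondition: length() == 0.
    explicit buffer_sketch(const std::shared_ptr<line_control_sketch> &a_lines)
        : lines(a_lines) {
        assert(lines != nullptr);
    }

private:
    // Declared but never defined, so instances cannot be copied or assigned.
    buffer_sketch(const buffer_sketch &);
    buffer_sketch &operator=(const buffer_sketch &);

    std::list<char32_t> buffer;
    std::shared_ptr<line_control_sketch> lines;
};
```

Routing all construction through one factory function guarantees that every instance is held by a managed pointer from the start.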


Member Function Documentation

void lestes::lang::cplus::lex::ucn_token_buffer::add_back ( const ptr< ucn_token > &  item  ) 

Adds single item to the end.

Adds single item to the end of the buffer.

Parameters:
item The item to be added.

Definition at line 66 of file ucn_token_buffer.cc.

References buffer.

00067 {
00068         buffer->push_back(item);
00069 }

void lestes::lang::cplus::lex::ucn_token_buffer::advance ( ucn_token_buffer::size_type  len  ) 

Discards items from the beginning.

Discards len items from the beginning. Used for skipping unimportant values.

Precondition:
len <= length()
Parameters:
len The length of the discarded sequence.

Definition at line 77 of file ucn_token_buffer.cc.

References buffer, lassert, and length().

00078 {
00079         lassert(len <= length());
00080         while (len != 0) {
00081                 buffer->pop_front();
00082                 len--;
00083         }
00084 }

ptr< ucn_token > lestes::lang::cplus::lex::ucn_token_buffer::peek_front ( void   )  const

Returns item at the beginning.

Returns item currently at the beginning of the buffer.

Precondition:
length() != 0
Returns:
The item at the beginning of the buffer.

Definition at line 91 of file ucn_token_buffer.cc.

References buffer, lassert, and length().

00092 {
00093         lassert(length() != 0);
00094         return buffer->front();
00095 }

ptr< token_value > lestes::lang::cplus::lex::ucn_token_buffer::extract_until ( ucn  stop  ) 

Extracts value until stop character without interpreting.

Removes items from the beginning of the buffer until the stop value is encountered. That value is not removed. Returns token value representing the removed items. Performs no scanning for ucn or ucn escape sequences. Used for hchar and qchar sequences.

Precondition:
The stop value is present in the sequence.
Parameters:
stop The value to stop scanning the sequence.
Returns:
The representation of the items' values.

Definition at line 135 of file ucn_token_buffer.cc.

References buffer, lestes::lang::cplus::lex::token_value::create(), lassert2, length(), and u.

00136 {
00137         ucn_token_buffer::size_type len = length();
00138         // reserve space
00139         ucn_string us(len,0xbeef);
00140         ucn u;
00141         
00142         buffer_type::iterator bit = buffer->begin();
00143         ucn_string::iterator sit = us.begin();
00144         for (ucn_token_buffer::size_type i = 0; i < len; i++, ++bit, ++sit) {
00145                 u = (*bit)->value_get();
00146                 if (u == stop) {
00147                         // erase removed items
00148                         buffer->erase(buffer->begin(),bit);
00149                         // return only the valid part of the string
00150                         return token_value::create(ucn_string(us.begin(),sit));
00151                 }
00152                 *sit = u;
00153         }
00154 
00155         lassert2(false,"The stop value was not found");
00156         return NULL; 
00157 }
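
The behaviour of extract_until (consume everything before the stop value, leave the stop value itself in the buffer) can be reproduced in isolation. The following is a simplified sketch, not the library code: `extract_until_sketch` is a hypothetical free function working on a `std::list<char32_t>` and returning a `std::u32string` in place of `ptr< token_value >`.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical helper: removes the items preceding `stop` and returns them.
// The stop item itself stays at the front of the buffer.
std::u32string extract_until_sketch(std::list<char32_t> &buffer, char32_t stop) {
    std::u32string out;
    for (std::list<char32_t>::iterator it = buffer.begin();
         it != buffer.end(); ++it) {
        if (*it == stop) {
            // erase only the consumed prefix [begin, it)
            buffer.erase(buffer.begin(), it);
            return out;
        }
        out.push_back(*it);
    }
    // precondition violated: the stop value must be present
    assert(false && "The stop value was not found");
    return out;
}
```

For an h-char-sequence such as the contents of `<std>`, the caller would pass `>` as the stop value and handle the closing delimiter separately.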

ptr< token_value > lestes::lang::cplus::lex::ucn_token_buffer::extract_ordinary ( ucn_token_buffer::size_type  len  ) 

Extracts value without interpreting.

Removes len items from the beginning of the buffer. Returns token value representing the removed items. Performs no scanning for ucn or ucn escape sequences. Used for identifiers with no ucn in any form and literals without escaped ucn.

Precondition:
len <= length()
Parameters:
len The length of the removed sequence.
Returns:
The representation of the items' values.

Definition at line 106 of file ucn_token_buffer.cc.

References buffer, lestes::lang::cplus::lex::token_value::create(), lassert, and length().

00107 {
00108         lassert(len <= length());
00109         
00110         // reserve space
00111         ucn_string us(len,0xbeef);
00112         
00113         buffer_type::iterator bit = buffer->begin();
00114         ucn_string::iterator sit = us.begin();
00115         for (ucn_token_buffer::size_type i = 0; i < len; i++, ++bit, ++sit) {
00116                 *sit = (*bit)->value_get();
00117         }
00118         
00119         // erase removed items
00120         buffer->erase(buffer->begin(),bit);
00121 
00122         return token_value::create(us);
00123 }
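
For comparison, the fixed-length extraction can be written compactly with standard iterators. This is a sketch under the same assumptions as before: `extract_ordinary_sketch` is a hypothetical name, and plain `char32_t` values replace the token objects.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>

// Hypothetical helper: copies the first len items verbatim and removes them.
std::u32string extract_ordinary_sketch(std::list<char32_t> &buffer,
                                       std::size_t len) {
    assert(len <= buffer.size());
    // iterator just past the last item to extract
    std::list<char32_t>::iterator last =
        std::next(buffer.begin(), static_cast<std::ptrdiff_t>(len));
    std::u32string out(buffer.begin(), last);
    // erase removed items
    buffer.erase(buffer.begin(), last);
    return out;
}
```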

ptr< token_value > lestes::lang::cplus::lex::ucn_token_buffer::extract_simple_ucn ( ucn_token_buffer::size_type  len,
bool  identifier 
)

Extracts identifiers and numbers with ucn.

Removes len items from the beginning of the buffer. Returns token value representing the removed items. Performs scanning for ucn escape sequences, issues errors for ucn invalid in identifiers. Useful for identifiers and preprocessing numbers containing escaped or translated ucn characters.

Precondition:
len <= length()

Only contains backslash in ucn escape sequences.

The contained ucn escape sequences are well-formed.

Parameters:
len The length of the removed sequence.
identifier Flag set to true if extracting identifier.
Returns:
The representation of the items' values, with converted ucn escape sequences.

Definition at line 291 of file ucn_token_buffer.cc.

References BEGIN, buffer, lestes::lang::cplus::lex::token_value::create(), lassert, length(), lines, lestes::report, lestes::lang::cplus::lex::ucn_token::TOK_BASIC, lestes::lang::cplus::lex::ucn_token::TOK_TRANSLATED, u, lestes::lang::cplus::lex::ucn_escape_value_invalid, and lestes::lang::cplus::lex::ucn_escape_value_invalid_in_identifier.

00292 {
00293         lassert(len <= length());
00294         
00295         // state of the function
00296         enum {
00297                 BEGIN,
00298                 BACK,
00299                 UCN
00300         } fstate = BEGIN;
00301 
00302         ulint count = 0xbad, value = 0xbad;
00303         ptr<ucn_token> t;
00304         ptr<simple_location> loc;
00305         ucn_token_type utt;
00306         ucn u;
00307         
00308         // reserve space, the representation can only be shorter
00309         ucn_string us(len,0xbeef);
00310         
00311         buffer_type::iterator bit = buffer->begin();
00312         ucn_string::iterator sit = us.begin();
00313         for (ucn_token_buffer::size_type i = 0; i < len; i++, ++bit) {
00314                 t = *bit;
00315                 loc = t->location_get();
00316                 utt = t->type_get();
00317                 u = t->value_get();
00318 
00319                 switch (fstate) {
00320                         case BEGIN:
00321                                 if (utt == ucn_token::TOK_BASIC) {
00322                                         if (u == character::ascii_backslash) {
00323                                                 fstate = BACK;
00324                                                 break;
00325                                         }
00326                                 } else {
00327                                         lassert(utt == ucn_token::TOK_TRANSLATED);
00328                                         if (identifier && !character::is_translated_identifier(u)) {
00329                                                 // value out of range for identifier
00330                                                 report << ucn_escape_value_invalid_in_identifier << lines->translate_location(loc);
00331                                         }
00332                                 }
00333                                 *sit = u;
00334                                 ++sit;
00335                                 break;
00336                         case BACK:
00337                                 // expecting only ucn escape sequences
00338                                 lassert(utt == ucn_token::TOK_BASIC && 
00339                                                 (u == character::ascii_lower_u || u == character::ascii_upper_u));
00340                                 count = u == character::ascii_lower_u ? 4 : 8;
00341                                 value = 0;
00342                                 fstate = UCN;
00343                                 break;
00344                         case UCN:
00345                                 // expecting only hexa digits in the ucn escape sequence
00346                                 lassert(utt == ucn_token::TOK_BASIC && character::is_xdigit(u));
00347                                 value = (value << 4) | (character::extract_xdigit(u) & 0x0f);
00348                                 if (--count == 0) {
00349                                         u = character::create_internal(value);
00350                                   
00351                                         if (!character::is_translated(u)) {
00352                                                 // ucn escape sequence value out of range
00353                                                 report << ucn_escape_value_invalid << lines->translate_location(loc);
00354                                                 // substitute for harmless
00355                                                 u = character::ascii_underscore;
00356                                         } else if (identifier && !character::is_translated_identifier(u)) {
00357                                                 // value out of range for identifier
00358                                                 report << ucn_escape_value_invalid_in_identifier << lines->translate_location(loc);
00359                                                 // keep the character, even if out of range
00360                                         }
00361                                         *sit = u;
00362                                         ++sit;
00363                                         fstate = BEGIN;
00364                                 }
00365                                 break;
00366                 }
00367         }
00368 
00369         // shall not be partial ucn escape sequence
00370         lassert(fstate == BEGIN);
00371         
00372         // erase removed items
00373         buffer->erase(buffer->begin(),bit);
00374 
00375         // use only filled part of the string
00376         return token_value::create(ucn_string(us.begin(),sit));
00377 }
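
The three-state scan used above (BEGIN, BACK, UCN) can be tried out in a reduced form. The sketch below covers only the conversion of well-formed `\uXXXX` and `\UXXXXXXXX` sequences into single code points; it deliberately omits the range checks, the identifier checks and the error reporting of the real function. `decode_simple_ucn` and `hex_value` are hypothetical names, and the input is a plain `std::u32string` rather than a run of tokens.

```cpp
#include <cassert>
#include <string>

// Returns the numeric value of a hexadecimal digit; the caller guarantees validity.
static unsigned hex_value(char32_t c) {
    if (c >= U'0' && c <= U'9') return c - U'0';
    if (c >= U'a' && c <= U'f') return c - U'a' + 10;
    assert(c >= U'A' && c <= U'F');
    return c - U'A' + 10;
}

// Hypothetical helper: replaces each well-formed ucn escape sequence
// with the code point it denotes; all other characters are copied.
std::u32string decode_simple_ucn(const std::u32string &in) {
    enum { BEGIN, BACK, UCN } state = BEGIN;
    unsigned count = 0;   // hexadecimal digits still expected
    char32_t value = 0;   // value accumulated so far
    std::u32string out;
    for (char32_t c : in) {
        switch (state) {
        case BEGIN:
            if (c == U'\\') state = BACK;
            else out.push_back(c);
            break;
        case BACK:
            // a backslash may only start a ucn escape sequence
            assert(c == U'u' || c == U'U');
            count = (c == U'u') ? 4 : 8;
            value = 0;
            state = UCN;
            break;
        case UCN:
            value = (value << 4) | hex_value(c);
            if (--count == 0) {
                out.push_back(value);
                state = BEGIN;
            }
            break;
        }
    }
    // the input shall not end inside an escape sequence
    assert(state == BEGIN);
    return out;
}
```

Because each escape of six or ten input characters collapses into one output character, the result is never longer than the input, which is why the original can reserve `len` positions up front.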

ptr< token_value > lestes::lang::cplus::lex::ucn_token_buffer::extract_invalid_ucn ( ucn_token_buffer::size_type  len,
bool  identifier 
)

Extracts identifiers and numbers with bad ucn.

Removes len items from the beginning of the buffer. Returns token value representing the removed items. Performs scanning for ucn escape sequences, issues errors for invalid ucn and ucn invalid in identifiers. Used for identifiers and preprocessing numbers containing translated or (possibly unterminated) escaped ucn characters.

Precondition:
len <= length()

Only contains backslash in ucn escape sequences.

Backslash is always followed by 'U' or 'u'.

Parameters:
len The length of the removed sequence.
identifier Flag set to true if extracting identifier.
Returns:
The representation of the items' values, with converted ucn escape sequences.

Definition at line 173 of file ucn_token_buffer.cc.

References BEGIN, buffer, lestes::lang::cplus::lex::token_value::create(), lassert, length(), lines, lestes::report, lestes::lang::cplus::lex::ucn_token::TOK_BASIC, lestes::lang::cplus::lex::ucn_token::TOK_TRANSLATED, u, lestes::lang::cplus::lex::ucn_escape_insufficient_digits, lestes::lang::cplus::lex::ucn_escape_value_invalid, and lestes::lang::cplus::lex::ucn_escape_value_invalid_in_identifier.

00174 {
00175         lassert(len <= length());
00176         
00177         // state of the function
00178         enum {
00179                 BEGIN,
00180                 BACK,
00181                 UCN
00182         } fstate = BEGIN;
00183 
00184         ulint count = 0xbad, value = 0xbad;
00185         ptr<ucn_token> t;
00186         ptr<simple_location> loc;
00187         ucn_token_type utt;
00188         ucn u;
00189         
00190         // reserve space, the representation can only be shorter
00191         ucn_string us(len,0xbeef);
00192         
00193         buffer_type::iterator bit = buffer->begin();
00194         ucn_string::iterator sit = us.begin();
00195         for (ucn_token_buffer::size_type i = 0; i < len; i++, ++bit) {
00196                 t = *bit;
00197                 utt = t->type_get();
00198                 u = t->value_get();
00199                 loc = t->location_get();
00200                 switch (fstate) {
00201                         case BACK:
00202                                 // expecting only ucn escape sequences
00203                                 lassert(utt == ucn_token::TOK_BASIC && 
00204                                                 (u == character::ascii_lower_u || u == character::ascii_upper_u));
00205                                 count = u == character::ascii_lower_u ? 4 : 8;
00206                                 value = 0;
00207                                 fstate = UCN;
00208                         break;
00209                         case UCN:
00210                                 if (utt == ucn_token::TOK_BASIC && character::is_xdigit(u)) {
00211                                         value = (value << 4) | (character::extract_xdigit(u) & 0x0f);
00212                                         if (--count == 0) {
00213                                                 u = character::create_internal(value);
00214                                           
00215                                                 if (!character::is_translated(u)) {
00216                                                         // ucn escape sequence value out of range
00217                                                         report << ucn_escape_value_invalid << lines->translate_location(loc);
00218                                                         // substitute for harmless
00219                                                         u = character::ascii_underscore;
00220                                                 } else if (identifier && !character::is_translated_identifier(u)) {
00221                                                         // value out of range for identifier
00222                                                         report << ucn_escape_value_invalid_in_identifier << lines->translate_location(loc);
00223                                                         // keep the character, even if out of range
00224                                                 }
00225 
00226                                                 *sit = u;
00227                                                 ++sit;
00228                                                 fstate = BEGIN;
00229                                         }
00230                                         break;
00231                                 } 
00232                                 
00233                                 // malformed ucn escape sequence
00234                                 report << ucn_escape_insufficient_digits << lines->translate_location(loc);
00235                                 
00236                                 // substitute the character and terminate the sequence
00237                                 *sit = character::ascii_underscore;
00238                                 ++sit;
00239                                 fstate = BEGIN;
00240 
00241                                 // fall through
00242                         case BEGIN:
00243                                 if (utt == ucn_token::TOK_BASIC) {
00244                                         if (u == character::ascii_backslash) {
00245                                                 fstate = BACK;
00246                                                 break;
00247                                         }
00248                                 } else {
00249                                         lassert(utt == ucn_token::TOK_TRANSLATED);
00250                                         if (identifier && !character::is_translated_identifier(u)) {
00251                                                 // value out of range for identifier
00252                                                 report << ucn_escape_value_invalid_in_identifier << lines->translate_location(loc);
00253                                         }
00254                                 }
00255                                 *sit = u;
00256                                 ++sit;
00257                                 break;
00258                 }
00259         }
00260 
00261         // shall not end with backslash
00262         lassert(fstate != BACK);
00263 
00264         if (fstate == UCN) {
00265                 // malformed ucn escape sequence
00266                 report << ucn_escape_insufficient_digits << lines->translate_location(loc);
00267                 // substitute the character and terminate the sequence
00268                 *sit = character::ascii_underscore;
00269                 ++sit;
00270         }
00271         
00272         // erase removed items
00273         buffer->erase(buffer->begin(),bit);
00274 
00275         // use only filled part of the string
00276         return token_value::create(ucn_string(us.begin(),sit));
00277 }
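
The recovery strategy shown above, where a ucn escape sequence that ends early is replaced by an underscore and the offending character is then treated as ordinary input, can be sketched separately. The code below is an illustration under stated assumptions, not the library routine: `decode_lenient_ucn` and `decode_result` are hypothetical names, errors are collected as input positions instead of being sent to `report`, and the range and identifier checks are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: decoded text plus positions of truncated escapes.
struct decode_result {
    std::u32string text;
    std::vector<std::size_t> errors;
};

static bool is_hex(char32_t c) {
    return (c >= U'0' && c <= U'9') || (c >= U'a' && c <= U'f') ||
           (c >= U'A' && c <= U'F');
}

static unsigned hex_value(char32_t c) {
    if (c >= U'0' && c <= U'9') return c - U'0';
    if (c >= U'a' && c <= U'f') return c - U'a' + 10;
    return c - U'A' + 10;
}

decode_result decode_lenient_ucn(const std::u32string &in) {
    enum { BEGIN, BACK, UCN } state = BEGIN;
    unsigned count = 0;
    char32_t value = 0;
    decode_result r;
    // handling of a character outside any escape sequence
    auto ordinary = [&](char32_t c) {
        if (c == U'\\') state = BACK;
        else r.text.push_back(c);
    };
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t c = in[i];
        switch (state) {
        case BEGIN:
            ordinary(c);
            break;
        case BACK:
            assert(c == U'u' || c == U'U');
            count = (c == U'u') ? 4 : 8;
            value = 0;
            state = UCN;
            break;
        case UCN:
            if (is_hex(c)) {
                value = (value << 4) | hex_value(c);
                if (--count == 0) {
                    r.text.push_back(value);
                    state = BEGIN;
                }
            } else {
                // too few digits: substitute, then reprocess c as ordinary input
                r.errors.push_back(i);
                r.text.push_back(U'_');
                state = BEGIN;
                ordinary(c);
            }
            break;
        }
    }
    assert(state != BACK);  // shall not end with backslash
    if (state == UCN) {
        // the input ended in the middle of an escape sequence
        r.errors.push_back(in.size());
        r.text.push_back(U'_');
    }
    return r;
}
```

The sketch calls a small lambda where the original relies on a switch fall-through from the UCN case into the BEGIN case; the observable behaviour for the character that ends the escape is intended to be the same.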

ptr< token_value > lestes::lang::cplus::lex::ucn_token_buffer::extract_ucn_literal ( ucn_token_buffer::size_type  len  ) 

Extracts literal with escaped ucn.

Removes len items from the beginning of the buffer. Returns token value representing the removed items. Performs scanning for ucn escape sequences, issues errors for invalid ucn. Useful for well-formed string and character literals.

Precondition:
len <= length()

Only contains well-formed escape sequences.

The contained ucn escape sequences are well-formed.

Parameters:
len The length of the removed sequence.
Returns:
The representation of the items' values, with converted ucn escape sequences.

Definition at line 390 of file ucn_token_buffer.cc.

References BEGIN, buffer, lestes::lang::cplus::lex::token_value::create(), lassert, length(), lines, lestes::report, lestes::lang::cplus::lex::ucn_token::TOK_BASIC, u, and lestes::lang::cplus::lex::ucn_escape_value_invalid.

00391 {
00392         lassert(len <= length());
00393         
00394         // state of the function
00395         enum {
00396                 BEGIN,
00397                 BACK,
00398                 UCN
00399         } fstate = BEGIN;
00400         ulint count = 0xbad, value = 0xbad;
00401         ptr<ucn_token> t;
00402         ucn_token_type utt;
00403         ptr<simple_location> loc;
00404         ucn u;
00405         
00406         // reserve space, the representation can only be shorter
00407         ucn_string us(len,0xbeef);
00408         
00409         buffer_type::iterator bit = buffer->begin();
00410         ucn_string::iterator sit = us.begin();
00411         for (ucn_token_buffer::size_type i = 0; i < len; i++, ++bit) {
00412                  t = *bit;
00413                  loc = t->location_get();
00414                  utt = t->type_get();
00415                  u = t->value_get();
00416                  
00417                  switch (fstate) {
00418                          case BEGIN:
00419                                  if (utt == ucn_token::TOK_BASIC && u == character::ascii_backslash) {
00420                                          fstate = BACK;
00421                                  } else {
00422                                         *sit = u;
00423                                         ++sit;
00424                                  }
00425                                  break;
00426                          case BACK:
00427                                  // expecting only basic characters after backslash
00428                                  lassert(utt == ucn_token::TOK_BASIC);
00429                                  if (u == character::ascii_lower_u || u == character::ascii_upper_u) {
00430                                         count = u == character::ascii_lower_u ? 4 : 8;
00431                                         value = 0;
00432                                         fstate = UCN;
00433                                  } else fstate = BEGIN;
00434                          break;
00435                          case UCN:
00436                                  // expecting only hexa digits in the ucn escape sequence
00437                                  lassert(utt == ucn_token::TOK_BASIC && character::is_xdigit(u));
00438                                  value = (value << 4) | (character::extract_xdigit(u) & 0x0f);
00439                                  if (--count == 0) {
00440                                          u = character::create_internal(value);
00441                                         
00442                                          if (!character::is_translated(u)) {
00443                                                  // ucn escape sequence value out of range
00444                                                  report << ucn_escape_value_invalid << lines->translate_location(loc);
00445                                                  // substitute for harmless
00446                                                  u = character::ascii_underscore;
00447                                          }
00448                                          
00449                                          *sit = u;
00450                                          ++sit;
00451                                          fstate = BEGIN;
00452                                  }
00453                                  break;
00454                  }
00455         }
00456 
00457         // shall not be partial escape sequence
00458         lassert(fstate == BEGIN);
00459         
00460         // erase removed items
00461         buffer->erase(buffer->begin(),bit);
00462 
00463         // use only filled part of the string
00464         return token_value::create(ucn_string(us.begin(),sit));
00465 }

ptr< token_value > lestes::lang::cplus::lex::ucn_token_buffer::extract_bad_literal ( ucn_token_buffer::size_type  len  ) 

Extracts malformed literal.

Removes len items from the beginning of the buffer. Returns token value representing the removed items. Performs scanning for escape sequences and issues errors for invalid ones. Used for character and string literals with ill-formed escape sequences.

Precondition:
len <= length()

The sequence does not end with backslash.

Parameters:
len The length of the removed sequence.
Returns:
The representation of the items' values, with converted ucn escape sequences.

Definition at line 477 of file ucn_token_buffer.cc.

References BEGIN, buffer, lestes::lang::cplus::lex::token_value::create(), lestes::lang::cplus::lex::invalid_escape_sequence, lassert, lassert2, length(), lines, lestes::lang::cplus::lex::missing_hexadecimal_digits, lestes::report, lestes::lang::cplus::lex::ucn_token::TOK_BASIC, u, lestes::lang::cplus::lex::ucn_escape_insufficient_digits, and lestes::lang::cplus::lex::ucn_escape_value_invalid.

00478 {
00479         lassert(len <= length());
00480         
00481         // state of the function
00482         enum {
00483                 BEGIN,
00484                 PASS,
00485                 BACK,
00486                 UCN,
00487                 OCT,
00488                 HEX
00489         } fstate = BEGIN;
00490         
00491         ulint count = 0xbad, value = 0xbad;
00492         ptr<ucn_token> t;
00493         ptr<simple_location> loc;
00494         ucn_token_type utt = 0xbad;
00495         ucn u = 0xbad;
00496         
00497         // reserve space
00498         ucn_string us(len,0xbeef);
00499         
00500         buffer_type::iterator bit = buffer->begin();
00501         ucn_string::iterator sit = us.begin();
00502         ucn_token_buffer::size_type i = 0;
00503         while (true) {
00504 
00505                 if (fstate == PASS) {
00506                         fstate = BEGIN;
00507                 } else if (i < len) {
00508                         t = *bit;
00509                         ++bit;
00510                         ++i;
00511                         loc = t->location_get();
00512                         utt = t->type_get();
00513                         u = t->value_get();
00514                 } else break;
00515 
00516                 switch (fstate) {
00517                         case BEGIN:
00518                                 if (utt == ucn_token::TOK_BASIC && u == character::ascii_backslash) {
00519                                         fstate = BACK;
00520                                 } else {
00521                                   *sit = u;
00522                                   ++sit;
00523                                 }
00524                                 break;
00525                         case BACK:
00526                                 if (utt == ucn_token::TOK_BASIC) {
00527                                         switch (u) {
00528                                                 case character::ascii_lower_u:
00529                                                         count = 4;
00530                                                         fstate = UCN;
00531                                                         break;
00532                                                 case character::ascii_upper_u:
00533                                                         count = 8;
00534                                                         fstate = UCN;
00535                                                         break;
00536                                                 case character::ascii_lower_x:
00537                                                         count = 1;
00538                                                         fstate = HEX;
00539                                                         break;
00540                                                 case character::ascii_quote:
00541                                                 case character::ascii_dquote:
00542                                                 case character::ascii_qmark:
00543                                                 case character::ascii_backslash:
00544                                                 case character::ascii_lower_a:
00545                                                 case character::ascii_lower_b:
00546                                                 case character::ascii_lower_f:
00547                                                 case character::ascii_lower_n:
00548                                                 case character::ascii_lower_r:
00549                                                 case character::ascii_lower_t:
00550                                                 case character::ascii_lower_v:
00551                                                         *sit = character::ascii_backslash;
00552                                                         ++sit;
00553                                                         *sit = u;
00554                                                         ++sit;
00555                                                         fstate = BEGIN;
00556                                                         break;
00557                                                 default:
00558                                                   if (character::is_odigit(u)) {
00559                                                           *sit = character::ascii_backslash;
00560                                                           ++sit;
00561                                                           *sit = u;
00562                                                           ++sit;
00563                                                           count = 2;
00564                                                           fstate = OCT;
00565                                                   } else {
00566                                                           // unknown escape sequence
00567                                                           report << invalid_escape_sequence << lines->translate_location(loc);
00568                                                           *sit = character::ascii_underscore;
00569                                                           ++sit;
00570                                                           fstate = BEGIN;
00571                                                   }
00572                                         }
00573                                 } else {
00574                                         // bad character after backslash
00575                                         report << invalid_escape_sequence << lines->translate_location(loc);
00576                                         *sit = character::ascii_underscore;
00577                                         ++sit;
00578                                         fstate = BEGIN;
00579                                 }
00580                                 break;
00581                         case UCN:
00582                                 // expecting only hexadecimal digits in the ucn escape sequence
00583                                 if (utt != ucn_token::TOK_BASIC || !character::is_xdigit(u)) {
00584                                         // bad ucn escape sequence
00585                                         report << ucn_escape_insufficient_digits << lines->translate_location(loc);
00586                                         *sit = character::ascii_underscore;
00587                                         ++sit;
00588                                         // keep the character for parsing
00589                                         fstate = PASS;
00590                                         break;
00591                                 }
00592 
00593                                 value = (value << 4) | (character::extract_xdigit(u) & 0x0f);
00594                                 if (--count == 0) {
00595                                         u = character::create_internal(value);
00596                                         if (!character::is_translated(u)) {
00597                                                 // disallowed ucn range
00598                                                 report << ucn_escape_value_invalid << lines->translate_location(loc);
00599                                                 *sit = character::ascii_underscore;
00600                                                 ++sit;
00601                                         } else {
00602                                                 *sit = u;
00603                                                 ++sit;
00604                                         }
00605                                         fstate = BEGIN;
00606                                 }
00607                                 break;
00608                         case OCT:
00609                                 if (utt == ucn_token::TOK_BASIC && character::is_odigit(u)) {
00610                                         *sit = u;
00611                                         ++sit;
00612                                         if (--count == 0) fstate = BEGIN;
00613                                 } else fstate = PASS;
00614                                 break;
00615                         case HEX:
00616                                 if (utt == ucn_token::TOK_BASIC && character::is_xdigit(u)) {
00617                                         if (count) {
00618                                                 count = 0;
00619                                                 *sit = character::ascii_backslash;
00620                                                 ++sit;
00621                                                 *sit = character::ascii_lower_x;
00622                                                 ++sit;
00623                                         }
00624                                         *sit = u;
00625                                         ++sit;
00626                                 } else {
00627                                         if (count) {
00628                                                 // \x with no xdigits
00629                                                 report << missing_hexadecimal_digits << lines->translate_location(loc);
00630                                                 *sit = character::ascii_underscore;
00631                                                 ++sit;
00632                                         }
00633                                         // keep the character for parsing
00634                                         fstate = PASS;
00635                                 }
00636                                 break;
00637                         default:
00638                                 lassert2(false,"You should never get here");
00639                 }
00640         }
00641 
00642         switch (fstate) {
00643                 case BEGIN:
00644                 case OCT:
00645                         break;
00646                 case HEX:
00647                         if (count) {
00648                                 // \x with no xdigits
00649                                 report << missing_hexadecimal_digits << lines->translate_location(loc);
00650                                 *sit = character::ascii_underscore;
00651                                 ++sit;
00652                         }
00653                         break;
00654                 case UCN:
00655                         // unterminated \uU
00656                         report << ucn_escape_insufficient_digits << lines->translate_location(loc);
00657                         *sit = character::ascii_underscore;
00658                         ++sit;
00659                         break;
00660                 default:
00661                         lassert2(false,"You should never get here");
00662         }
00663         
00664         // erase removed items
00665         buffer->erase(buffer->begin(),bit);
00666 
00667         // use only filled part of the string
00668         return token_value::create(ucn_string(us.begin(),sit));
00669 }
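
The UCN state in the listing above builds the code point four bits at a time and gives up as soon as a non-hexadecimal character arrives or the input ends early. A minimal standalone sketch of that accumulation follows. The names xdigit_value and accumulate_ucn are hypothetical and not part of lestes, plain char stands in for the ucn type, and the digit counts of 4 and 8 are an assumption taken from the C++ \u and \U forms rather than from this listing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for character::is_xdigit / character::extract_xdigit:
// returns the value of one hexadecimal digit, or -1 if c is not one.
inline int xdigit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Reads exactly `count` hexadecimal digits from the start of `digits`,
// most significant first, shifting the running value by four bits per digit.
// Returns false (the "insufficient digits" case) when a non-hex character
// or the end of input is reached before `count` digits were consumed.
inline bool accumulate_ucn(const std::string &digits, unsigned count,
                           std::uint32_t &value) {
    // Assumption: 4 digits for \u and 8 digits for \U.
    assert(count == 4 || count == 8);
    std::uint32_t result = 0;
    std::size_t pos = 0;
    while (count != 0) {
        if (pos == digits.size()) return false;   // unterminated escape
        int d = xdigit_value(digits[pos++]);
        if (d < 0) return false;                  // bad character in escape
        result = (result << 4) | (static_cast<std::uint32_t>(d) & 0x0f);
        --count;
    }
    value = result;
    return true;
}
```

The sketch only covers digit accumulation. The range check that the listing performs with character::is_translated, and the substitution of an underscore on error, are left out.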

ucn_token_buffer::size_type lestes::lang::cplus::lex::ucn_token_buffer::length ( void   )  const

Returns length of the buffer.

Returns:
The current length of the buffer.

Definition at line 675 of file ucn_token_buffer.cc.

References buffer.

Referenced by advance(), extract_bad_literal(), extract_invalid_ucn(), extract_ordinary(), extract_simple_ucn(), extract_ucn_literal(), extract_until(), and peek_front().

00676 {
00677         return buffer->size();
00678 }

ptr< ucn_token_buffer > lestes::lang::cplus::lex::ucn_token_buffer::create ( const ptr< line_control > &  a_lines  )  [static]

Returns new buffer.

Returns a new, empty ucn token buffer.

Postcondition:
length() == 0
Parameters:
a_lines The associated line control.
Returns:
The new ucn token buffer.

Definition at line 695 of file ucn_token_buffer.cc.

References ucn_token_buffer().

00696 {
00697         return new ucn_token_buffer(a_lines);
00698 }
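
The class combines a static create() with a protected constructor and a hidden copy constructor and assignment operator, so instances can only be obtained through the factory. The following sketch illustrates that pattern under stated assumptions: token_buffer_sketch is a hypothetical class, std::shared_ptr stands in for the lestes ptr<> smart pointer, int stands in for ucn_token, and the line_control argument is omitted.

```cpp
#include <cassert>
#include <list>
#include <memory>

// Hypothetical illustration of the factory pattern; not the lestes class.
class token_buffer_sketch {
public:
    typedef std::list<int>::size_type size_type;

    // Returns a new, empty buffer; the only way to obtain an instance.
    static std::shared_ptr<token_buffer_sketch> create() {
        std::shared_ptr<token_buffer_sketch> result(new token_buffer_sketch());
        assert(result->length() == 0);  // documented postcondition
        return result;
    }

    // Adds a single item to the end.
    void add_back(int item) { buffer.push_back(item); }

    // Returns the current length of the buffer.
    size_type length() const { return buffer.size(); }

protected:
    // Creates an empty buffer; reachable only via create() or subclasses.
    token_buffer_sketch() {}

private:
    // Declared but not defined, which hides copying as in the original class.
    token_buffer_sketch(const token_buffer_sketch &);
    token_buffer_sketch &operator=(const token_buffer_sketch &);

    std::list<int> buffer;
};
```

Calling token_buffer_sketch::create() and then length() on the result yields 0, matching the postcondition stated for create(). Constructing or copying the class directly from outside does not compile.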

void lestes::lang::cplus::lex::ucn_token_buffer::gc_mark ( void   )  [protected, virtual]

Marks the object.

Reimplemented from lestes::std::mem::keystone.

Definition at line 683 of file ucn_token_buffer.cc.

References buffer, and lines.

00684 {
00685         buffer.gc_mark();
00686         lines.gc_mark();
00687 }

ucn_token_buffer& lestes::lang::cplus::lex::ucn_token_buffer::operator= ( const ucn_token_buffer &  )  [private]

Hides assignment operator.


Member Data Documentation

srp<buffer_type> lestes::lang::cplus::lex::ucn_token_buffer::buffer [private]

Buffer to hold stored data.

Definition at line 98 of file ucn_token_buffer.hh.

Referenced by add_back(), advance(), extract_bad_literal(), extract_invalid_ucn(), extract_ordinary(), extract_simple_ucn(), extract_ucn_literal(), extract_until(), gc_mark(), length(), and peek_front().

srp<line_control> lestes::lang::cplus::lex::ucn_token_buffer::lines [private]

Line control to transform locations.

Definition at line 100 of file ucn_token_buffer.hh.

Referenced by extract_bad_literal(), extract_invalid_ucn(), extract_simple_ucn(), extract_ucn_literal(), and gc_mark().


The documentation for this class was generated from the following files:
ucn_token_buffer.hh
ucn_token_buffer.cc
Generated on Mon Feb 12 18:24:19 2007 for lestes by doxygen 1.5.1-20070107