TEXT vs. VARCHAR

Jan 20, 2010 / By Sheeri Cabral

Tags: ,

On first glance, it looks like TEXT and VARCHAR can store the same information. However, there are fundamental differences between the way TEXT fields and VARCHAR fields work, which are important to take into consideration.

Standard
VARCHAR is actually part of the ISO SQL:2003 standard; The TEXT data types, including TINYTEXT, are non-standard.

Storage
TEXT data types are stored as separate objects from the tables and result sets that contain them. This storage is transparent — there is no difference in how a query involving a TEXT field is written versus one involving a VARCHAR field. Since TEXT is not stored as part of a row, retrieval of TEXT fields requires extra [edited 1/22] memory overhead.


Maximum VARCHAR length
The maximum row length of a VARCHAR is restricted by the maximum row length of a table. This is 65,535 bytes for most storage engines (NDB has a different maximum row value). Theoretically the maximum length of a VARCHAR is 65,536 bytes. Overhead further limits the actual maximum size of a VARCHAR.

Storing the length of a VARCHAR field takes 1 byte if the VARCHAR field has a maximum length of 0-255 bytes; if it is greater than 255 bytes, the overhead to store the length is 2 bytes. If the VARCHAR field allows NULL values, that adds additional overhead — every table uses 1 byte of overhead for each set of 8 fields that allow NULL values. If the VARCHAR is the only row in the table, and does not allow NULL values, the maximum length allowed for VARCHAR is 65,532 bytes.

Keep in mind that that the number in VARCHAR(x) represents number of characters, not number of bytes. Therefore, you may have difficulties trying to define a table with only VARCHAR(65532) if the character set uses multi-byte characters, such as UTF-8.

If you attempt to define a VARCHAR value that is longer than allowed, you will run into an error such as 1118 or 1074:

ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. You have to change some columns to TEXT or BLOBs.

ERROR 1074 (42000): Column length too big for column 'col_name' (max=[max number here]); use BLOB or TEXT instead

Maximum TEXT length
The maximum size of a TEXT data type depends on which type of TEXT data type is being used. Because they are stored as objects, the only row overhead in the table object is a pointer (8 or 16 bytes). Here is a list of the maximum TEXT length, and the overhead (in the TEXT object):

    • TINYTEXT – up to 255 bytes, 1 byte overhead

 

    • TEXT – up to 64 Kb, 2 bytes overhead

 

    • MEDIUMTEXT – up to 16 Mb, 3 bytes overhead

 

  • LONGTEXT – up to 4 Gb, 4 bytes overhead

DEFAULT values
MySQL does not allow TEXT data types to have a default value other than NULL. VARCHAR fields are allowed to be created with a DEFAULT value.

Conclusions
Because of the storage implications, it is preferable to use VARCHAR instead of TINYTEXT.

If you need to have a DEFAULT value that is not NULL, you must use VARCHAR (or CHAR).

If you need to store strings longer than approximately 64 Kb, use MEDIUMTEXT or LONGTEXT. VARCHAR cannot support storing values that large.

Make sure you are aware of the effects of a multi-byte character set. VARCHAR(255) stores 255 characters, which may be more than 255 bytes.

37 Responses to “TEXT vs. VARCHAR”

  • TEXT and BLOB columns also consume more memory for sorts.

  • Hello,

    One should also note that InnoDB will only store a variable-length column (VARCHAR, VARBINARY, TEXT, BLOB) on the page if the total row length is less than half a page (about eight thousand bytes); otherwise, InnoDB will start moving variable-length columns to off-page storage.

    Darius

    • Niiezha says:

      hi i am a php beginner with a very ltlite knowledge i have made a simple login form with php/mysql .the thing is whenever i test it on my localhost(XAMPP) it always displays on browser can not select DB plz help me brother i am stuck at this point and really need help

  • Darius — while that is true, this article is about the difference between VARCHAR and TEXT types, and InnoDB’s storage of them is the same.

  • shinguz says:

    Hi Sheeri,

    I did a long time ago some tests, at a customer with too much memory, where TEXT outperformed VARCHAR significantly (putting tmpdir to a ramdisk to avoid tmp disk tables to physical disk).

    Have you investigated in this direction as well? Is there still a performance difference?

    Regards,
    Oli

  • kedar says:

    Good Post.
    Extending Justin’s comment–
    Mysql Doc Reads:
    “Instances of BLOB or TEXT columns in the result of a query that is processed using a temporary table causes the server to use a table on disk rather than in memory because the MEMORY storage engine does not support those data types.”
    So disk “operations” costs performance.

  • Sergei Golubchik says:

    So, InnoDB storage of them is the same, MyISAM storage of them is the same. In what engine the storage of VARCHAR and TEXT is different ?

  • Domas says:

    Sheeri,

    so, “storage is the same”, but somehow “extra i/o” is done – you seem to be a bit confused.

    Text storage inside MySQL’s memory is different, as it is not part of the row. Now the actual storage is version/engine specific, but in most cases TEXT and VARCHAR storage will be same.

  • Domas — I’m not confused — Darius said:

    InnoDB only stores VARCHAR and TEXT within the same page if the total row length is about 1/2 a page, otherwise it stores it off-page.

    ———-
    The point of this article was to show how TEXT and VARCHAR are different. I explained in the post that the storage was different — TEXT fields are stored as separate objects (hence extra I/O), VARCHAR fields are stored inline.

    Darius was adding to that by saying “in addition, if a TEXT/VARCHAR is bigger than a certain size, it’s stored on a different memory page”. He specifically said it’s done for both TEXT and VARCHAR — the post was about what’s *different*.

    So yes, general MySQL object storage is different, as explained in the post, but InnoDB doesn’t have *additional* special storage for TEXT that’s not the same for VARCHAR. This should answer Sergei’s question too.

  • Kedar,

    At some point somebody wrote a patch that made the memory engine support variable width text..

    Anybody else remember that?

  • Mark Callaghan says:

    @Justin — that would be the ebay patch. There were plans to get it into mariadb. It would be nice to see it in official mysql. It is a big patch.

  • Sergei Golubchik says:

    Sheeri, I’m sorry, but it doesn’t answer my or Domas question.

    Darius said that InnoDB treats VARCHAR and TEXT identically, not differently. I can assure you that MyISAM does not store them separately either.

    So, this is my question – what engine are you talking about saying that “general MySQL object storage is different [for VARCHAR and TEXT]” ?

  • Sergei: From the MySQL Manual:

    “Each BLOB or TEXT value is represented internally by a separately allocated object. This is in contrast to all other data types, for which storage is allocated once per column when the table is opened. ”

    That’s what I mean. A TEXT value does not sit inline with other row values like INT or DATETIME, it is a “separately allocated object”. A VARCHAR value usually does sit inline with other row values.

    (manual reference:
    http://dev.mysql.com/doc/refman/5.0/en/blob.html

    I checked and it’s also in the 4.1 manual at:

    http://dev.mysql.com/doc/refman/4.1/en/blob.html
    )

    • Jemima says:

      ???? ?? ???????? ??? ??????????????? ???????, ?? ????? ?????? ????????? ??????????????: *) SELECT TABLE_ROWS FROM iomnrfation_schema.TABLES WHERE TABLE_SCHEMA = ‘db_name’ AND TABLE_NAME = ‘table_name’ *) SELECT CARDINALITY FROM iomnrfation_schema.STATISTICS WHERE TABLE_SCHEMA = ‘db_name’ AND TABLE_NAME = ‘table_name’ AND INDEX_NAME = ‘PRIMARY’

  • Hello,

    VARCHAR and TEXT are represented with different objects internally in MySQL. However, any benefits from that difference are rendered moot when your operations involve reading off-page.

    The confusion likely arises from VARCHAR being “part of the row” in the MySQL internals, while in terms of pages and disk, it may be “not part of the row” (on an overflow page).

    Not to mention that InnoDB has two different table formats: the old one “Antelope”, and the new one “Barracuda” appearing in plugin with ROW_FORMAT=DYNAMIC. In Antelope, InnoDB will store 768 bytes of the variable length column on-page and a 20 byte pointer to the overflow page. In Barracuda, the longer columns will be entirely stored off-page, with only the pointer on-page.

    But what Sheeri said is true, that everything I’m saying has nothing to do with differences between VARCHAR and TEXT. My reason for mentioning it was that the differences between VARCHAR and TEXT aren’t so important in light how the storage engine handles variable-length rows.

    Darius

  • Ró?nica mi?dzy TEXT i VARCHAR w MySQL » Internet Doktor says:

    [...] informacji na temat kolumn VARCHAR i TEXT mo?na znale?? na blogu Pythian i w dokumentacji [...]

  • Sergei Golubchik says:

    Sheeri, you are right. BLOBs are, indeed, represented internally by a separately allocated object. That is, in the row image in the MySQL internal row format VARCHAR (and any other type besides BLOBs) sits in the row image. While BLOB is elsewhere, in a separately allocated memory, and in the row image there is only a pointer to the real blob data.

    But this applies only and exclusively to how the row is internally represented in *memory* – after the row is retrieved from the storage engine. This description has nothing to do with how the row is stored.

  • Ah Sergei — that’s fair, I will change “requires extra I/O overhead” to “requires extra memory overhead”.

  • erin o'neill says:

    It sure would be great if Vadim over at Percona could benchmark this! I’ll have to nudge him the next time I see him.

    Thanks for the great post!

    • Matheus says:

      Yes. I’ve since found this out. My trick worked for the initail data sets that I needed it for, but it obviously doesn’t work as expected for all (most?) datasets.For more than a year now I have revisited this problem off and on and have yet to find a better (read actual ) solution to this need without using code to reorder the results.

  • Walkthrough: Porting Asp.Net MVC Website to Mono 2.6.1 and MySql on Linux Apache – Porting to Mono Part 3 of 3 - Nathan Bridgewater says:

    [...] I used the MySql Migration Toolkit (part of the GUI Workbench) to perform my migration. It actually worked out pretty well with the exception of a few data type conversions. I like to use varchar(max) fields in Sql server. I also use bit fields for booleans since they parse right over to C# booleans.  The migration tool didnt’ like varchar(max) and converted them to varchar(-1). It also converted my bit fields to tinyint(4). So during the migration, I was able to make edits to a few of my tables that contained those fields. For varchar(max), you have a choice. You can: use a smaller field like varchar(5000), use a Text field, or you can choose to store your text data in a binary blob and encode/decode it to UTF8.  For this, application, I went the route of selecting a smaller varchar field for simplicity; but if my content pages were large, I would probably go the route of text. I found some interesting discussion in this post on the topic. [...]

  • Janusz says:

    So what about storing texts like abstracts or mails, that have 10 000 to 30 000 characters. Should one use VARCHAR(30000) or TEXT ? The most important thing for me is performance, full-text search performance. What column type would be faster? For example i have the ‘articles’ table, with columns: title (1000 chars), abstract (30000 chars), and i want to quickly find particular article(s) in a table like 20 000 000 rows using full-text search.

  • I would use TEXT – the maximum theoretical size of varchar is 65,536 bytes and in practice it’s shorter….so if you have VARCHAR(30000) and the character set is utf-8, you won’t be able to create that column. If you’re using a character set that is 1 or 2 bytes per character, you can create it, but I wouldn’t limit myself like that in case you want to convert in the future.

    If you want fast fulltext search, using MySQL’s FULLTEXT search is not the best tool — consider using Sphinx or Lucene.

  • I have a table that contains a description field set as TEXT in which my clients enter descriptive text in the CMS.

    It can contain regular text, and html markup.

    My first consideration is to perform a vetical partition on the table so the description field is in a separate table,

    My question is, would I benefit (overall site performance ) by converting this field to varchar with no size declaration?

  • Michael,

    You cannot create a varchar field with no size declaration.

  • Thank you Sheeri, I thought about it right after I typed it. What is your opinion on setting the VARCHAR size to 30000 – which in our case should suffice, considering the number of characters being input to that particular field?

    Would there be an overall performance benefit?

  • Michael — the “Conclusions” section of the blog post should be pretty clear about that.

    Conclusions
    Because of the storage implications, it is preferable to use VARCHAR instead of TINYTEXT.

    Of course, that’s if that’s possible, you may run into other size limitations (such as row size, or size if you are using a multi-byte character set).

    Make sure you are aware of the effects of a multi-byte character set. VARCHAR(255) stores 255 characters, which may be more than 255 bytes.

    I will also add that the answer to “would there be an overall performance benefit” is almost always “I can tell you for a general case but the best way to do it would be to get benchmarks for a system with load, then try the change out and take benchmarks again.” I cannot stress having a testing environment enough in these cases.

    That being said, I would expect better performance with VARCHAR than with TINYTEXT. However, depending on what you’re doing, the best solution might be something completely different, such as having a different storage engine or putting the text in a different application. (MySQL’s FULLTEXT search isn’t as good as other solutions such as Sphinx and Lucene)

  • xerxes says:

    Since MEMORY engine (think temp tables) maps variable-width types as appropriate fixed-width counterparts, those are obliviously padded to the maximum width of the original column. Does using NULL effectively rule out padding for specific rows thus saving memory or the value still gets padded *with* the flag included?

    I assume no padding occurs in case of NULL, and that’s why DEFAULT ” on variable-width types is generally considered bad practice. Am I correct?

  • OnLine says:

    Hi,
    From what i read on other websites, TINYTEXT uses less memory, but in long run request can fragment it.
    ‘i don’t know if its something better.

  • So what about storing texts like abstracts or mails, that have 10 000 to 30 000 characters. Should one use VARCHAR(30000) or TEXT ? The most important thing for me is performance, full-text search performance. What column type would be faster? For example i have the ‘articles’ table, with columns: title (1000 chars), abstract (30000 chars), and i want to quickly find particular article(s) in a table like 20 000 000 rows using full-text search.

  • Nate Haug says:

    If the VARCHAR is the only row in the table, and does not allow NULL values, the maximum length allowed for VARCHAR is 65,532 bytes.

    This sentence is incorrect. It should be “If the VARCHAR is the only COLUMN in the table”, not “row”. If you have multiple varchar columns, you should assume 65,532/(varchar columns) max per row.

  • Sourav says:

    Hi.

    I want to know why is there a need of storing the TEXT/BLOB kind of data in a separate object? What is the benefit of that?

  • [...] Note: The type :string is used for things that are relatively short, like usernames, title of articles etc. The type :text is for long pieces of text, like the contents of an article. Typically the database implementation might allows as much as gigabytes of text in a field of the :text type. (ref) [...]

  • Nate Haug – you should assume 65,532/(varchar columns) max per row

    this is also incorrect, it should be “you should assume 65,532/(varchar columns) max per columns”

    Regarding Varchar ,Just want to confirm if 65,532 is the maximum varchar length/value per row? thanks in advance

  • xingye says:

    The maximum row size for the used table type, not counting BLOBs, is 65535.Storing the length of a VARCHAR field takes 1 byte if the VARCHAR field has a maximum length of 0-255 bytes. “If it is greater than 255 bytes, the overhead to store the length is 2 bytes”. So I think the maximum length allowed for VARCHAR is 65533,not 65532!!!
    like this:
    temp>show create table var_t\G
    *************************** 1. row ***************************
    Table: var_t
    Create Table: CREATE TABLE `var_t` (
    `a` varchar(65533) NOT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    1 row in set (0.00 sec)

    • sylar says:

      yes,that’s right! both 65532(varchar default null) and 65533 (varchar not null) for myisam engine and innodb engine, have nothing whth the storage egine type?

  • > TINYTEXT – up to 255 bytes, 1 byte overhead [...] Because of the storage implications, it is preferable to use VARCHAR instead of TINYTEXT.

    I’m not sure, but I hope that if I crunch a couple more KBs from my floppy disk by using disk compressor, I can take it.

Leave a Reply

  • (will not be published)

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>