TEXT vs. VARCHAR

Posted in: Technical Track

On first glance, it looks like TEXT and VARCHAR can store the same information. However, there are fundamental differences between the way TEXT fields and VARCHAR fields work, which are important to take into consideration.

Standard
VARCHAR is actually part of the ISO SQL:2003 standard; The TEXT data types, including TINYTEXT, are non-standard.

Storage
TEXT data types are stored as separate objects from the tables and result sets that contain them. This storage is transparent — there is no difference in how a query involving a TEXT field is written versus one involving a VARCHAR field. Since TEXT is not stored as part of a row, retrieval of TEXT fields requires extra [edited 1/22] memory overhead.


Maximum VARCHAR length
The maximum row length of a VARCHAR is restricted by the maximum row length of a table. This is 65,535 bytes for most storage engines (NDB has a different maximum row value). Theoretically the maximum length of a VARCHAR is 65,536 bytes. Overhead further limits the actual maximum size of a VARCHAR.

Storing the length of a VARCHAR field takes 1 byte if the VARCHAR field has a maximum length of 0-255 bytes; if it is greater than 255 bytes, the overhead to store the length is 2 bytes. If the VARCHAR field allows NULL values, that adds additional overhead — every table uses 1 byte of overhead for each set of 8 fields that allow NULL values. If the VARCHAR is the only row in the table, and does not allow NULL values, the maximum length allowed for VARCHAR is 65,532 bytes.

Keep in mind that that the number in VARCHAR(x) represents number of characters, not number of bytes. Therefore, you may have difficulties trying to define a table with only VARCHAR(65532) if the character set uses multi-byte characters, such as UTF-8.

If you attempt to define a VARCHAR value that is longer than allowed, you will run into an error such as 1118 or 1074:

ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. You have to change some columns to TEXT or BLOBs.

ERROR 1074 (42000): Column length too big for column 'col_name' (max=[max number here]); use BLOB or TEXT instead

Maximum TEXT length
The maximum size of a TEXT data type depends on which type of TEXT data type is being used. Because they are stored as objects, the only row overhead in the table object is a pointer (8 or 16 bytes). Here is a list of the maximum TEXT length, and the overhead (in the TEXT object):

    • TINYTEXT – up to 255 bytes, 1 byte overhead

 

    • TEXT – up to 64 Kb, 2 bytes overhead

 

    • MEDIUMTEXT – up to 16 Mb, 3 bytes overhead

 

  • LONGTEXT – up to 4 Gb, 4 bytes overhead

DEFAULT values
MySQL does not allow TEXT data types to have a default value other than NULL. VARCHAR fields are allowed to be created with a DEFAULT value.

Conclusions
Because of the storage implications, it is preferable to use VARCHAR instead of TINYTEXT.

If you need to have a DEFAULT value that is not NULL, you must use VARCHAR (or CHAR).

If you need to store strings longer than approximately 64 Kb, use MEDIUMTEXT or LONGTEXT. VARCHAR cannot support storing values that large.

Make sure you are aware of the effects of a multi-byte character set. VARCHAR(255) stores 255 characters, which may be more than 255 bytes.

Interested in working with Sheeri? Schedule a tech call.

41 Comments. Leave new

Justin Swanhart
January 20, 2010 3:35 pm

TEXT and BLOB columns also consume more memory for sorts.

Reply
Darius Jahandarie
January 20, 2010 7:03 pm

Hello,

One should also note that InnoDB will only store a variable-length column (VARCHAR, VARBINARY, TEXT, BLOB) on the page if the total row length is less than half a page (about eight thousand bytes); otherwise, InnoDB will start moving variable-length columns to off-page storage.

Darius

Reply

hi i am a php beginner with a very ltlite knowledge i have made a simple login form with php/mysql .the thing is whenever i test it on my localhost(XAMPP) it always displays on browser can not select DB plz help me brother i am stuck at this point and really need help

Reply

Darius — while that is true, this article is about the difference between VARCHAR and TEXT types, and InnoDB’s storage of them is the same.

Reply

Hi Sheeri,

I did a long time ago some tests, at a customer with too much memory, where TEXT outperformed VARCHAR significantly (putting tmpdir to a ramdisk to avoid tmp disk tables to physical disk).

Have you investigated in this direction as well? Is there still a performance difference?

Regards,
Oli

Reply

Good Post.
Extending Justin’s comment–
Mysql Doc Reads:
“Instances of BLOB or TEXT columns in the result of a query that is processed using a temporary table causes the server to use a table on disk rather than in memory because the MEMORY storage engine does not support those data types.”
So disk “operations” costs performance.

Reply
Sergei Golubchik
January 21, 2010 3:19 am

So, InnoDB storage of them is the same, MyISAM storage of them is the same. In what engine the storage of VARCHAR and TEXT is different ?

Reply

Sheeri,

so, “storage is the same”, but somehow “extra i/o” is done – you seem to be a bit confused.

Text storage inside MySQL’s memory is different, as it is not part of the row. Now the actual storage is version/engine specific, but in most cases TEXT and VARCHAR storage will be same.

Reply

Domas — I’m not confused — Darius said:

InnoDB only stores VARCHAR and TEXT within the same page if the total row length is about 1/2 a page, otherwise it stores it off-page.

———-
The point of this article was to show how TEXT and VARCHAR are different. I explained in the post that the storage was different — TEXT fields are stored as separate objects (hence extra I/O), VARCHAR fields are stored inline.

Darius was adding to that by saying “in addition, if a TEXT/VARCHAR is bigger than a certain size, it’s stored on a different memory page”. He specifically said it’s done for both TEXT and VARCHAR — the post was about what’s *different*.

So yes, general MySQL object storage is different, as explained in the post, but InnoDB doesn’t have *additional* special storage for TEXT that’s not the same for VARCHAR. This should answer Sergei’s question too.

Reply
Justin Swanhart
January 21, 2010 2:31 pm

Kedar,

At some point somebody wrote a patch that made the memory engine support variable width text..

Anybody else remember that?

Reply
Mark Callaghan
January 21, 2010 4:14 pm

@Justin — that would be the ebay patch. There were plans to get it into mariadb. It would be nice to see it in official mysql. It is a big patch.

Reply
Sergei Golubchik
January 21, 2010 5:59 pm

Sheeri, I’m sorry, but it doesn’t answer my or Domas question.

Darius said that InnoDB treats VARCHAR and TEXT identically, not differently. I can assure you that MyISAM does not store them separately either.

So, this is my question – what engine are you talking about saying that “general MySQL object storage is different [for VARCHAR and TEXT]” ?

Reply

Sergei: From the MySQL Manual:

“Each BLOB or TEXT value is represented internally by a separately allocated object. This is in contrast to all other data types, for which storage is allocated once per column when the table is opened. ”

That’s what I mean. A TEXT value does not sit inline with other row values like INT or DATETIME, it is a “separately allocated object”. A VARCHAR value usually does sit inline with other row values.

(manual reference:
http://dev.mysql.com/doc/refman/5.0/en/blob.html

I checked and it’s also in the 4.1 manual at:

http://dev.mysql.com/doc/refman/4.1/en/blob.html
)

Reply

???? ?? ???????? ??? ??????????????? ???????, ?? ????? ?????? ????????? ??????????????: *) SELECT TABLE_ROWS FROM iomnrfation_schema.TABLES WHERE TABLE_SCHEMA = ‘db_name’ AND TABLE_NAME = ‘table_name’ *) SELECT CARDINALITY FROM iomnrfation_schema.STATISTICS WHERE TABLE_SCHEMA = ‘db_name’ AND TABLE_NAME = ‘table_name’ AND INDEX_NAME = ‘PRIMARY’

Reply
Darius Jahandarie
January 22, 2010 10:06 am

Hello,

VARCHAR and TEXT are represented with different objects internally in MySQL. However, any benefits from that difference are rendered moot when your operations involve reading off-page.

The confusion likely arises from VARCHAR being “part of the row” in the MySQL internals, while in terms of pages and disk, it may be “not part of the row” (on an overflow page).

Not to mention that InnoDB has two different table formats: the old one “Antelope”, and the new one “Barracuda” appearing in plugin with ROW_FORMAT=DYNAMIC. In Antelope, InnoDB will store 768 bytes of the variable length column on-page and a 20 byte pointer to the overflow page. In Barracuda, the longer columns will be entirely stored off-page, with only the pointer on-page.

But what Sheeri said is true, that everything I’m saying has nothing to do with differences between VARCHAR and TEXT. My reason for mentioning it was that the differences between VARCHAR and TEXT aren’t so important in light how the storage engine handles variable-length rows.

Darius

Reply
Ró?nica mi?dzy TEXT i VARCHAR w MySQL » Internet Doktor
January 22, 2010 12:22 pm

[…] informacji na temat kolumn VARCHAR i TEXT mo?na znale?? na blogu Pythian i w dokumentacji […]

Reply
Sergei Golubchik
January 22, 2010 12:28 pm

Sheeri, you are right. BLOBs are, indeed, represented internally by a separately allocated object. That is, in the row image in the MySQL internal row format VARCHAR (and any other type besides BLOBs) sits in the row image. While BLOB is elsewhere, in a separately allocated memory, and in the row image there is only a pointer to the real blob data.

But this applies only and exclusively to how the row is internally represented in *memory* – after the row is retrieved from the storage engine. This description has nothing to do with how the row is stored.

Reply

Ah Sergei — that’s fair, I will change “requires extra I/O overhead” to “requires extra memory overhead”.

Reply

It sure would be great if Vadim over at Percona could benchmark this! I’ll have to nudge him the next time I see him.

Thanks for the great post!

Reply

Yes. I’ve since found this out. My trick worked for the initail data sets that I needed it for, but it obviously doesn’t work as expected for all (most?) datasets.For more than a year now I have revisited this problem off and on and have yet to find a better (read actual ) solution to this need without using code to reorder the results.

Reply
Walkthrough: Porting Asp.Net MVC Website to Mono 2.6.1 and MySql on Linux Apache – Porting to Mono Part 3 of 3 - Nathan Bridgewater
February 24, 2010 11:21 pm

[…] I used the MySql Migration Toolkit (part of the GUI Workbench) to perform my migration. It actually worked out pretty well with the exception of a few data type conversions. I like to use varchar(max) fields in Sql server. I also use bit fields for booleans since they parse right over to C# booleans.  The migration tool didnt’ like varchar(max) and converted them to varchar(-1). It also converted my bit fields to tinyint(4). So during the migration, I was able to make edits to a few of my tables that contained those fields. For varchar(max), you have a choice. You can: use a smaller field like varchar(5000), use a Text field, or you can choose to store your text data in a binary blob and encode/decode it to UTF8.  For this, application, I went the route of selecting a smaller varchar field for simplicity; but if my content pages were large, I would probably go the route of text. I found some interesting discussion in this post on the topic. […]

Reply

So what about storing texts like abstracts or mails, that have 10 000 to 30 000 characters. Should one use VARCHAR(30000) or TEXT ? The most important thing for me is performance, full-text search performance. What column type would be faster? For example i have the ‘articles’ table, with columns: title (1000 chars), abstract (30000 chars), and i want to quickly find particular article(s) in a table like 20 000 000 rows using full-text search.

Reply

I would use TEXT – the maximum theoretical size of varchar is 65,536 bytes and in practice it’s shorter….so if you have VARCHAR(30000) and the character set is utf-8, you won’t be able to create that column. If you’re using a character set that is 1 or 2 bytes per character, you can create it, but I wouldn’t limit myself like that in case you want to convert in the future.

If you want fast fulltext search, using MySQL’s FULLTEXT search is not the best tool — consider using Sphinx or Lucene.

Reply
Michael Oleksak
June 12, 2010 9:43 pm

I have a table that contains a description field set as TEXT in which my clients enter descriptive text in the CMS.

It can contain regular text, and html markup.

My first consideration is to perform a vetical partition on the table so the description field is in a separate table,

My question is, would I benefit (overall site performance ) by converting this field to varchar with no size declaration?

Reply
Sheeri Cabral
June 14, 2010 7:38 am

Michael,

You cannot create a varchar field with no size declaration.

Reply
Michael Oleksak
June 14, 2010 12:08 pm

Thank you Sheeri, I thought about it right after I typed it. What is your opinion on setting the VARCHAR size to 30000 – which in our case should suffice, considering the number of characters being input to that particular field?

Would there be an overall performance benefit?

Reply
Sheeri Cabral
June 14, 2010 1:18 pm

Michael — the “Conclusions” section of the blog post should be pretty clear about that.

Conclusions
Because of the storage implications, it is preferable to use VARCHAR instead of TINYTEXT.

Of course, that’s if that’s possible, you may run into other size limitations (such as row size, or size if you are using a multi-byte character set).

Make sure you are aware of the effects of a multi-byte character set. VARCHAR(255) stores 255 characters, which may be more than 255 bytes.

I will also add that the answer to “would there be an overall performance benefit” is almost always “I can tell you for a general case but the best way to do it would be to get benchmarks for a system with load, then try the change out and take benchmarks again.” I cannot stress having a testing environment enough in these cases.

That being said, I would expect better performance with VARCHAR than with TINYTEXT. However, depending on what you’re doing, the best solution might be something completely different, such as having a different storage engine or putting the text in a different application. (MySQL’s FULLTEXT search isn’t as good as other solutions such as Sphinx and Lucene)

Reply

Since MEMORY engine (think temp tables) maps variable-width types as appropriate fixed-width counterparts, those are obliviously padded to the maximum width of the original column. Does using NULL effectively rule out padding for specific rows thus saving memory or the value still gets padded *with* the flag included?

I assume no padding occurs in case of NULL, and that’s why DEFAULT ” on variable-width types is generally considered bad practice. Am I correct?

Reply

Hi,
From what i read on other websites, TINYTEXT uses less memory, but in long run request can fragment it.
‘i don’t know if its something better.

Reply
khalid mahmood
October 31, 2011 5:21 am

So what about storing texts like abstracts or mails, that have 10 000 to 30 000 characters. Should one use VARCHAR(30000) or TEXT ? The most important thing for me is performance, full-text search performance. What column type would be faster? For example i have the ‘articles’ table, with columns: title (1000 chars), abstract (30000 chars), and i want to quickly find particular article(s) in a table like 20 000 000 rows using full-text search.

Reply

If the VARCHAR is the only row in the table, and does not allow NULL values, the maximum length allowed for VARCHAR is 65,532 bytes.

This sentence is incorrect. It should be “If the VARCHAR is the only COLUMN in the table”, not “row”. If you have multiple varchar columns, you should assume 65,532/(varchar columns) max per row.

Reply

Hi.

I want to know why is there a need of storing the TEXT/BLOB kind of data in a separate object? What is the benefit of that?

Reply
Notes on Rails migrations | Notes to Web
March 23, 2012 9:00 pm

[…] Note: The type :string is used for things that are relatively short, like usernames, title of articles etc. The type :text is for long pieces of text, like the contents of an article. Typically the database implementation might allows as much as gigabytes of text in a field of the :text type. (ref) […]

Reply
christopher musni
October 10, 2012 10:50 pm

Nate Haug – you should assume 65,532/(varchar columns) max per row

this is also incorrect, it should be “you should assume 65,532/(varchar columns) max per columns”

Regarding Varchar ,Just want to confirm if 65,532 is the maximum varchar length/value per row? thanks in advance

Reply

The maximum row size for the used table type, not counting BLOBs, is 65535.Storing the length of a VARCHAR field takes 1 byte if the VARCHAR field has a maximum length of 0-255 bytes. “If it is greater than 255 bytes, the overhead to store the length is 2 bytes”. So I think the maximum length allowed for VARCHAR is 65533,not 65532!!!
like this:
temp>show create table var_t\G
*************************** 1. row ***************************
Table: var_t
Create Table: CREATE TABLE `var_t` (
`a` varchar(65533) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Reply

yes,that’s right! both 65532(varchar default null) and 65533 (varchar not null) for myisam engine and innodb engine, have nothing whth the storage egine type?

Reply

> TINYTEXT – up to 255 bytes, 1 byte overhead […] Because of the storage implications, it is preferable to use VARCHAR instead of TINYTEXT.

I’m not sure, but I hope that if I crunch a couple more KBs from my floppy disk by using disk compressor, I can take it.

Reply

thanks for sharing. before read your artilce my table in database using TEXT. I’ll change to VARCHAR I hope will increase performace my script.

Reply
down or just me (dojme)
October 17, 2014 1:15 am

I use VARCHAR after I read your article. thanks

Reply
Markella Skempri
December 5, 2014 5:50 am

Could we possibly have an updated version on this opinion for MySQL 5.6?

Reply

n why dont they just offer one standard of text eg text(5) to hold 5 character of text regardless. What is all this varchar,text,char,tinytext???
What happened to keep it simple.

Reply

Leave a Reply

Your email address will not be published. Required fields are marked *