Redis Basics & Data Structures

Introduction#

What is Redis#

Redis is an in-memory high-speed caching database. Full name: Remote Dictionary Server.

Redis is written in C language, a key-value storage system that supports data types such as string, list, set, zset, and hash.

Redis can be used in scenarios such as caching, event publishing or subscribing, high-speed queues, etc. It is memory-based and can be persisted.

Advantages of Redis#

Excellent read and write performance
Rich data types
Atomicity
Rich features (publish/subscribe, notifications, expiration)
Supports persistence (RDB and AOF persistence methods)
Distributed

What Can Redis Do#

Data caching
Time-limited business
Counters (incrby)
Distributed locks (distributed locks implemented based on Redisson)
Rate limiting (using List in Redis as a message queue)
Message queues

Installation/Startup#

Installation#

Centos8 Install Redis🔗

Startup#

# Start in the foreground
redis-server
# Start in the background
brew services start redis
# Stop
brew services stop redis

Connection#

redis-cli

Default port 6379

Redis has 16 databases by default, 0-15

Use select 1 to switch

Basic Commands#

keys * View the keys in the current database (there are 16 databases)
exists key Check if this key exists
type key View the type of this key
del key or unlink key Delete a specific key
- unlink deletes asynchronously
expire key 10 Set the expiration time for the key in seconds
ttl key View how many seconds until the key expires -1 means never expires -2 means already expired
select Switch databases
dbsize View the number of keys in the current database
flushdb Clear all keys in the current database

Basic Types (5 types)#

string#

The string type is binary safe and can contain any data

String is the basic data type of Redis, with a maximum value size of 512Mb

Commands#

set <key> <value>
get <key> View
append <key> <value> Append value after the key
- If this key does not exist, it will be created
strlen <key> Get the length of the value corresponding to this key
setnx <key> <value> Create only if the key does not exist
incr <key> Increment the value of this key by 1
decr <key> Decrement the value by 1
incrby/decrby <key> <num> Add or subtract num from the key's value

These operations are atomic operations and will not be interrupted by other threads during execution, as Redis is single-threaded.

mset <key> <value> <key> <value> <key> <value> Set multiple keys at once
mget <key> <key> <key> Get multiple keys at once
msetnx <key> <value> <key> <value> Set multiple keys at once when the keys do not exist
- Atomic operation: if one fails, all fail
getrange <key> <start> <end> Get all values between and in the key
setrange <key> <offset> <value> Overwrite and set value after offset in the key
setex <key> <expiration time> <value> Set expiration time when setting the key
get set <key> <value> Get the previous value and set a new one after getting

Data Structure#

Arraylist

Use Cases#

Caching: Use Redis as a MySQL cache to reduce MySQL read and write pressure
Counters: The advantage of Redis being single-threaded
Session: Spring session + Redis to achieve session sharing

list#

Single key with multiple values, underlying doubly linked list

Commands#

lpush/rpush <key> <v1> <v2> <v3> Push values from both ends of the doubly linked list
lrange <key> 0 -1 Enumerate the list and get all
rpop/lpop <key> Get and delete values from either end
rpoplpush <k1> <k2> Pop a value from the right of k1 and push it to the left of k2
lindex <key> index Get the value at the index
llen <k> Get the length
linsert before/after <v1> <v2> Insert v2 before/after v1
lrem <key> <n> <v> Delete n values from the left
lset <k> <index> <v> Set the index to v

Data Structure#

When the number of elements is small, it uses ziplist; when there are many elements, it combines multiple ziplists into a quicklist.

Use Cases#

Message queues

set#

Elements in a set cannot be duplicated, underlying hash table, with O(1) complexity for add, delete, and query

Commands#

sadd <k> <v1> <v2> Add
smembers <k> Retrieve all values in the set
sismember <k> <v> Check if this value exists
scard <k> Return the length of this set
srem <k> <v> Delete
spop <k> Randomly take out and delete one
srandmember <k> <n> Randomly take out n without deleting
smove <k1> <k2> <v> Move the value from k1 to k2
sinter <k1> <k2> Return the intersection
sunion <k1> <k2> Return the union
sdiff <k1> <k2> Return the difference

Use Cases#

Likes, shares, favorites, etc.

hash#

Similar to map<string,object> in Java, where the key is the object name and the value contains field and value

Commands#

hset <k> <field> <value> Set key and field-value in the key
hget <k> <field> Get the value in field of k
hmset <k> <field1> <v1> <field2> <v2> Set multiple at once
hexists <k1> <field> Check if the field exists in the key
hkeys <key> Get all fields of the key
hvals <key> Get all values of the key, without fields
hincrby <k> <f> <n> Increment the value of f by n
hsetnx <k> <f> <v> Set value if field does not exist

Data Structure#

When the field-value length is short and the number is small, use ziplist; otherwise, use hashtable.

Use Cases#

Caching

Zset#

Sorted set

Each element has a score, which is used to sort the set; scores can be duplicated

Weighted sorted set, similar to a priority queue in Java, but Redis replaces the queue with a set

Commands#

zadd <k> <score> <k> <score> <k> <score> Add key and score, ranked by score
zrange <k> <start> <stop> withscores Get elements with scores between start and stop
zrangebyscore <k> <min> <max> Return and rank elements with minimum and maximum scores
zincrby <k> <n> <v> Add n to the score of element v
zrem <k> <v> Delete
zcount <k> <min> <max> Count the number of elements between min and max scores
zrank <k> <v> Get the rank of the element, starting from 0

Data Structure#

Hash, skip list

Use Cases#

Leaderboards

Special Types (3 types)#

The following content is learned from this, using code snippets

Original link🔗

HyperLogLogs#

What is Cardinality#

The union of two sets

For example: A = {1,2,3,4,5,6}; B = {5,6,7,8,9}; Cardinality = {1,2,3,4,7,8,9}, which are the unique elements.

Use Cases#

Statistics and counting, used to count registered IP numbers, daily visiting IP numbers, real-time UV of pages, online user counts, etc.

Advantages#

Based on cardinality estimation algorithms, it can estimate cardinality accurately, using a small amount of fixed memory to store and identify unique elements in the set.

This algorithm is not necessarily accurate, with a 0.81% error rate; cardinality statistics are suitable for scenarios that can accept a certain tolerance for error, such as counting visitor numbers.

Commands#

127.0.0.1:6379> pfadd key1 a b c d e f g h i	# Create the first group of elements
(integer) 1
127.0.0.1:6379> pfcount key1					# Count the cardinality of the elements
(integer) 9
127.0.0.1:6379> pfadd key2 c j k l m e g a		# Create the second group of elements
(integer) 1
127.0.0.1:6379> pfcount key2
(integer) 8
127.0.0.1:6379> pfmerge key3 key1 key2			# Merge two groups: key1 key2 -> key3 union
OK
127.0.0.1:6379> pfcount key3
(integer) 13

Bitmap#

Bitmap records information using binary, with only 0 and 1 states.

Use Cases#

Count user login information, logged in and not logged in
Count user like information, liked and not liked
Count employee clock-in information, clocked in and not clocked in

Commands#

Use setbit <key> <offset> <value> to write data, where key can be anything, offset must be an integer, and value can only be 0 or 1.

If you want to count employee clock-in status for a week, key can be set to sign, offset can be set to 0-6, clocked in as 0, not clocked in as 1.

127.0.0.1:6379> setbit sign 0 1
(integer) 0
127.0.0.1:6379> setbit sign 1 1
(integer) 0
127.0.0.1:6379> setbit sign 2 0
(integer) 0
127.0.0.1:6379> setbit sign 3 1
(integer) 0
127.0.0.1:6379> setbit sign 4 0
(integer) 0
127.0.0.1:6379> setbit sign 5 0
(integer) 0
127.0.0.1:6379> setbit sign 6 1
(integer) 0

Use getbit <key> <offset> to check the status of a specific key, returning 0 or 1.

127.0.0.1:6379> getbit sign 3
(integer) 1
127.0.0.1:6379> getbit sign 5
(integer) 0

Use bitcount <key> to count the total number of a certain data.

127.0.0.1:6379> bitcount sign
(integer) 3

geospatial#

Can be used to infer geographical location information, such as the distance between two places.

Commands#

geoadd Add locations.

127.0.0.1:6379> geoadd china:city 118.76 32.04 nanjing 112.55 37.86 taiyuan 123.43 41.80 shenyang
(integer) 3
127.0.0.1:6379> geoadd china:city 144.05 22.52 shengzhen 120.16 30.24 hangzhou 108.96 34.26 xian
(integer) 3

geopos Get the geographical position of specified members.

127.0.0.1:6379> geopos china:city taiyuan nanjing
1) 1) "112.54999905824661255"
   1) "37.86000073876942196"
2) 1) "118.75999957323074341"
   1) "32.03999960287850968"

geodist Get the distance between two places, with units specified.

127.0.0.1:6379> geodist china:city taiyuan shenyang m
"1026439.1070"
127.0.0.1:6379> geodist china:city taiyuan shenyang km
"1026.4391"

georadius

Here is a detailed explanation🔗

Nearby people, obtain the addresses of all nearby people, location, queried by radius.

georadiusbymember

link🔗

Show other members within a certain radius of the specified member.

Data Structures in Redis#

5 basic data structures: string, list, set, hash, zset
3 special data structures: HyperLogLogs, Bitmap, Geospatial

Correspondence#

Image from

Dynamic String SDS#

Redis does not use strings in C but defines its own data structure, called Simple Dynamic String (SDS). The underlying data structure of String in Redis is SDS.

SDS Memory Structure#

len stores the length of the string saved in SDS
buf[] array stores each element of the string
alloc represents the entire SDS in uint8, uint16, uint32, uint64
flags is always a character, with the low three bits indicating the type of the header

Advantages of SDS#

O(1) complexity to get string length: In Redis, to get the string length, you only need to read the len attribute. In C, to get the string length, you need to traverse the entire string, which is O(n) complexity. Use strlen <K> to get the string length.
Prevents buffer overflow: In C, using strcat to concatenate two strings without allocating enough memory can cause buffer overflow. SDS first estimates whether the length after concatenation meets memory requirements based on len. If not, it expands first and then concatenates, thus preventing buffer overflow.
Reduces memory reallocation times for modifying strings: SDS uses len and alloc to implement space pre-allocation and lazy space release.
- Space pre-allocation: When expanding string space, it expands more than needed, thus reducing the number of memory reallocations required when the string length increases. When the string is less than 1M, Redis allocates double the string size; when the string size is greater than 1M, it allocates an additional 1M of space.
- Lazy space release: When the string length decreases, it does not immediately return memory but uses the alloc attribute to record this space for future use. If you feel this will waste space, you can set a timer to release unused space.
Binary safe: In C, a null character is used to indicate the end of a string. However, if a binary file (such as an image) saved in Redis contains a null character, C cannot read it correctly. SDS uses the length indicated by the len attribute to determine whether the string has ended.

ZipList#

Ziplist is a special encoding doubly linked list designed to improve storage efficiency. It can be used to store strings or integers.

Ziplist Memory Structure#

zlbytes stores the total memory bytes occupied by the ziplist
zltail stores the offset of the last entry in the ziplist for quick access to the last entry, facilitating quick pop operations
zllen stores the number of entries in the ziplist.
zlend is the termination byte

Entry Structure#

prevlen size of the previous entry
encoding type and length of the current entry
entry-data entry data

PS: If the entry-data stores integer data, the encoding will combine with entrydata, making the entry structure .

Advantages of Ziplist#

Saves memory

In a regular list, each element occupies the same space size, equal to the size occupied by the largest element in the current list, causing space waste.
Ziplist uses a special data structure to ensure that each element occupies space equal to its own size, reducing space waste.
Traditional lists have contiguous addresses in memory, while ziplist occupies inconsistent sizes, so addresses are not contiguous, requiring the prelen field to save the length of the previous element to solve the traversal problem.

Disadvantages of Ziplist#

Does not reserve memory space; when elements are removed from the list, the unused space is immediately returned, causing each modification of the list to require reallocation of space.

QuickList#

A doubly linked list with ziplist as nodes. How to understand QuickList: List<List<ziplist>> quickList.

Quicklist Memory Structure#

quicklistNode nodes in the list, in quicklist, this node is a ziplist
quickListLZF ziplist compressed with LZ4 algorithm can be wrapped into a quickListLZF structure
quicklistBookmark located at the tail of the quicklist
quicklist
- head, tail point to head and tail pointers
- len indicates the number of nodes in the list
- count indicates the number of entries in the ziplist in the quicklist
- fill controls the maximum space occupied by each ziplist in the quicklist
- compress controls whether to compress each ziplist using the LZ4 algorithm to save space
quicklistIter iterator
quicklistEntry encapsulation of entries in ziplist

Hash Table - Dict - HashTable#

Hash tables are similar to HashTables in most languages, with differences mainly in handling hash collisions and resizing.

Hash Collision#

Using chaining method.

Resizing#

Resizing steps:

Create a new hash table based on the size of the original hash table, with a size double that of the original hash table (shrinking also reduces to half the original size).
Recalculate the index value using the hash algorithm.
After the new table is created and assigned, delete the original table.

Conditions for resizing:

Redis has not executed bgsave or bgrewriteaof commands, and the load factor is greater than or equal to 1.
Redis has not executed bgsave or bgrewriteaof commands, and the load factor is greater than or equal to 5.

Load factor = Number of nodes in the hash table / Size of the hash table.

IntSet#

Intset is one of the underlying implementations of the set type in Redis. When a set contains only integer elements and the number of elements is small, Redis uses intset as the underlying implementation of the set key.

Memory Structure#

encoding encoding method
length stores the number of integers
contents points to the contiguous memory area where the actual values are stored.

Resizing#

Intset is based on an array. When a newly inserted element's size exceeds the size of the largest element in the original array, the entire array needs to be resized, and the space occupied by each element is modified to the size of the largest element.

It will resize but will not shrink!

Skip List#

The purpose of a skip list is similar to that of a red-black tree, for performance. A skip list can complete insertion, search, and deletion in logarithmic time. However, skip lists occupy more space, belonging to a space-for-time data structure.

What is a Skip List#

In a traditional linked list, if you need to query an element, you must traverse the entire list from the head until you find the queried element, with a time complexity of O(n). A skip list reduces the traversal length by adding multiple levels of indexing, thus speeding up the search. A skip list achieves effects similar to balanced trees and hash tables.

Why not use balanced trees but skip lists?

Link🔗

Image from