PAT 1097 Deduplication on a Linked List

  1. 1. Deduplication on a Linked List

Deduplication on a Linked List

Given a singly linked list L with integer keys, you are supposed to remove the nodes with duplicated absolute values of the keys. That is, for each value K, only the first node of which the value or absolute value of its key equals K will be kept. At the mean time, all the removed nodes must be kept in a separate list. For example, given L being 21→-15→-15→-7→15, you must output 21→-15→-7, and the removed list -15→15.

Input Specification

Each input file contains one test case. For each case, the first line contains the address of the first node, and a positive N (≤10^​5​​) which is the total number of nodes. The address of a node is a 5-digit nonnegative integer, and NULL is represented by −1.

Then N lines follow, each describes a node in the format:

1
Address Key Next

where Address is the position of the node, Key is an integer of which absolute value is no more than 10​^4​​, and Next is the position of the next node.

Output Specification

For each case, output the resulting linked list first, then the removed list. Each node occupies a line, and is printed in the same format as in the input.

Sample Input

1
2
3
4
5
6
00100 5
99999 -7 87654
23854 -15 00000
87654 15 -1
00000 -15 99999
00100 21 23854

Sample Output

1
2
3
4
5
00100 21 23854
23854 -15 99999
99999 -7 -1
00000 -15 87654
87654 15 -1


类似这样的骚操作网上有很多,这里不在引用。本文主要考虑常规思路。思路非常简单,便利链表,如果曾经出现过就插入另一个链表,没出现过就继续,最后得到两个链表,分别是去重后的和重复的。看似比较简单,想把所有情况考虑全也是要花点心思的。

  • 重复部分的链表是和原链表顺序是一样的,采用尾插法。需要头、尾两个指针。
  • 原链表的头节点是第一个节点,不可能被放入重复链表,那么去重后的链表头节点不变。
  • 遍历过程中需要将节点删除(即插入重复链表),所以需要prev指针记录去重链表的上一个节点地址,以便修改next指针。

如果你像我一样以为上面的描述就把所有的情况都考虑完全了,那么欢迎你和我来到BUG的海洋😉

先来看AC的代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
#include "vector"
#include "iostream"
#include "algorithm"
using namespace std;
struct node
{
int addr;
int key;
int next;
};
node nodes[100086];
bool map[10024];
void print(int head)
{
auto cur = head;
while (cur != -1)
{
if (nodes[cur].next != -1)
printf("%05d %d %05d\n", cur, nodes[cur].key, nodes[cur].next);
else printf("%05d %d -1\n", cur, nodes[cur].key);
cur = nodes[cur].next;
}
}
int main()
{
int head, n;
cin >> head >> n;
for (int i = 0; i < n; i++)
{
int id;
cin >> id;
nodes[id].addr = id;
cin >> nodes[id].key >> nodes[id].next;
}
fill(map, map + 10024, false);
int dupli_head = -1;
int dupli_tail = -1;
int cur = head;
int prev = -1;
while (cur != -1)
{
while (map[abs(nodes[cur].key)])
{
int next = nodes[cur].next;
if (dupli_head == -1) dupli_head = dupli_tail = cur;
else
{
nodes[dupli_tail].next = cur;
dupli_tail = cur;
nodes[cur].next = -1;
}
cur = next;
if (cur == -1) break;
}
if (cur == -1) { nodes[prev].next = -1; break; }
if (prev != -1) nodes[prev].next = cur;
map[abs(nodes[cur].key)] = true;
prev = cur;
cur = nodes[cur].next;
}
print(head);
print(dupli_head);
return 0;
}

为啥while循环里又套了个while循环,用if不就够了么?

此乃第一坑😉
上文说到

prev指针记录去重链表的上一个节点地址

那么prevnext应该是去重链表的下一个地址。而if得到的仅仅是下一个地址,此时还不知道他是否属于去重链表。这时不能做prev.next=cur。那什么时候可以呢?当然是map[abs(nodes[cur].key)为假啦。

好多-1看的我头晕

恭喜你,进入了第二坑😉

  1. dupli_headdupli_tail是头尾指针,-1作为链表为空时的特殊值。

  2. 插入dupli链表后,要把cur.next置为-1,表示这是新链表的最后一个节点。为防止链表断裂,用next=cur.next记住下一个地址。

  3. 插入重复链表是在while循环里进行的,那么就有可能在这里耗光所有的元素。此即if (cur == -1) break;

  4. 当在while循环里耗尽所有元素后,已经不可能有新的元素可以作为prev.next,因此if (cur == -1) { nodes[prev].next = -1; break; }

  5. 上文说过,原链表的头节点一定是去重链表的头节点。那么当刚访问过头节点时,是不需要对prev操作的。即if (prev != -1) nodes[prev].next = cur;。当然,你也可以直接跳过头节点
    1
    2
    3
    map[head]=true;
    cur = nodes[head].next;
    prev = head;

怎么样,还觉得考虑所有情况简单么😉